Abstract

This paper introduces a framework of parametric descriptive directional types for con- 
straint logic programming (CLP). It proposes a method for locating type errors in CLP 
programs and presents a prototype debugging tool. The main technique used is checking 
correctness of programs w.r.t. type specifications. The approach is based on a generaliza- 
tion of known methods for proving correctness of logic programs to the case of parametric 
specifications. Set-constraint techniques are used for formulating and checking verifica- 
tion conditions for (parametric) polymorphic type specifications. The specifications are 
expressed in a parametric extension of the formalism of term grammars. The soundness of 
the method is proved and the prototype debugging tool supporting the proposed approach 
is illustrated on examples. 

The paper is a substantial extension of the previous work by the same authors concern- 
ing monomorphic directional types. 



1 Introduction 

The objective of this work is to support development of CLP programs by a tool 
that checks correctness of a (partially developed) program wrt an approximate 
specification. Failures of such checks are used to locate fragments of the program 
which are potential program errors. 

The specifications we work with extend the traditional concept of directional type
for logic programs (see e.g. (Bronsard et al., 1992)). Such a specification associates
with every predicate a pair of sets that characterize, respectively, expected calls
and successes of the predicate. Checking correctness of a logic program wrt
directional types has been discussed by several authors (see e.g. (Aiken & Lakshman,
1994; Boye, 1996; Boye & Maluszyński, 1997; Charatonik & Podelski, 1998) and
references therein). Their proposals can be seen as special cases of general
verification methods of (Drabent & Maluszyński, 1988; Bossi & Cocco, 1989; Deransart,
1993). Technically, directional type checking consists in proving that the sets
specified by given directional types of a program satisfy certain verification conditions
constructed for this program. For directional types expressed as set constraints the
verification conditions can also be expressed as set constraints and the check can
be performed by set constraint techniques (see e.g. (Aiken & Lakshman, 1994)).

In this paper we propose an extension of directional types which addresses two 
issues: 

• CLP programs operate on constraint domains while (pure) logic programs are 
restricted to one specific constraint domain which is the Herbrand universe. 
Directional types of a logic program characterize calls and successes of each 
predicate as sets of terms. This is not sufficient for CLP where manipulated 
data include constraints over non-Herbrand domains. To account for that we 
use a notion of constrained term where a constraint from a specific domain is 
attached to a non-ground term. We define the concept of directional type for 
CLP programs using sets of constrained terms. 

• In logic programming, as well as in CLP, some procedures may be associated 
with families of directional types, rather than with single types. For example, 
typical list manipulation procedures may be used for lists with elements of 
any type and return lists with the elements of the same type. This is known as 
parametric polymorphism and can be described by a parametric specification, 
in our case by a parametric directional type. We extend the concept of par- 
tial correctness of CLP program to the case of parametric specifications and 
we give a sufficient condition for a program to be correct wrt a parametric 
specification. We apply this condition to correctness checking of CLP programs
wrt parametric directional types, and for locating program errors.
As shown by examples in Section 6, use of parametric specifications improves
the possibility of locating errors.

The problem of checking of polymorphic directional types has been recently
formulated in a framework of a formal calculus (Rychlikowski & Truderung, 2000;
Rychlikowski & Truderung, 2001). As explained in Section 7, that approach is
substantially different from ours.

A parametric specification can be seen as a family of (parameter-free) specifica- 
tions. As mentioned above, our specifications refer to sets of constrained terms. The 
sufficient conditions for correctness can be formulated as set constraints, involving 
operations on the specified sets, such as projection, intersection and inclusion. 

For constructing an automatic tool for checking correctness of specifications two 
questions have to be addressed: 

• How to represent sets so that the necessary operations can be effectively 
performed, 

• How to deal with parametric specifications. 



The first problem was already discussed in (Drabent et al., 2000b; Drabent et al.,
2000a), which extends our earlier work (Comini et al., 1998; Comini et al., 1999).



We have chosen to represent sets of constrained terms by a simple extension of 
the formalism of discriminative term grammars, where sets of constrained terms 
are constructed from a finite collection of base sets. Term grammars (or equivalent
formalisms) and set constraints have been used by many authors for specifying and
inferring types for logic programs (see among others (Mishra, 1984; Frühwirth et al.,
1991; Dart & Zobel, 1992; Gallagher & de Waal, 1994; Aiken & Lakshman, 1994;
Boye, 1996; Devienne et al., 1997a; Charatonik & Podelski, 1998)). We show how
the operations on discriminative term grammars can be extended to handle sets of
constrained terms introduced by the extended discriminative term grammars.

A solution to the second problem is a main contribution of this paper. We derive
it by showing how the approach of (Drabent et al., 2000a) can be extended to
the case of parametric specifications. (In our former work parametric grammars 
were used only in the user interface, to represent families of grammars.) First we 
have to give a new, more precise, presentation of that approach. We present a 
natural extension of the notion of partial correctness to the case of parametric 
specifications, so that the special case of parameterless specifications reduces to 
the notion used in our previous work. We introduce a concept of PED-grammar 
(parametric discriminative extended term grammar) as a formalism for specifying 
families of sets of constrained terms. We define operations on PED-grammars that 
make it possible to approximate results of the respective operations on members 
of the so defined families. We use them for checking correctness of programs wrt 
parametric directional types, and for locating potential errors. 

If the verification conditions of a logic program are expressed as set constraints, it
is possible to infer directional types that satisfy them. For example, the techniques
of (Heintze & Jaffar, 1990a; Heintze & Jaffar, 1991) make it possible to construct a
term grammar¹ describing the least model of the set constraints. The use of these
techniques for program analysis in general was discussed in (Heintze, 1992).

On the other hand, it is possible to use abstract interpretation techniques to infer 
directional types of a program. Soundness of an abstract interpretation method 
can be justified by deriving it systematically from the verification conditions. An
example of an abstract interpretation approach is (Janssens & Bruynooghe, 1992;
Van Hentenryck et al., 1995). A technique of (Gallagher & de Waal, 1994), similar
to abstract interpretation, derives types in a form equivalent to discriminative term
grammars. In (Drabent et al., 2000a) we modified the latter technique to infer
directional types for CLP programs. In this paper we present its further extension 
for inferring parametric directional types. We prove that this extension is sound in 
the sense that the program is correct wrt the inferred parametric types. 

We use our technique of parametric type checking for locating errors in CLP 
programs. More precisely, we check correctness of a program wrt a parametric spec- 
ification of directional types and we indicate fragments of clauses where the check 
of the verification conditions fails. However, CLP languages are often not typed so 
that programs do not include type specifications. Therefore our methodology does 
not require that the type specification is given a priori. The user decides a posteriori 
whether or not to type check a program, or its fragment. 

The type specification is usually provided in a step-wise interactive way. At each
stage of this process the program is checked against the fragment of the
specification at hand. So incremental building of the specification is coupled together with
locating errors. Even small fragments of the specification are often sufficient to
locate (some) errors in the program. On the other hand, if no program errors have
been located when the specification is completed then the program is correct (wrt
the specification). Notice however that not every error message corresponds to an
actual error in the program. That is why we call the error messages "warnings".
This is due to using approximated specifications and to approximations made in
the process of checking.

¹ In general this grammar is non-discriminative.

In the proposed methodology the process of type specification is preceded by 
static analysis which infers directional types of the program. The inferred types 
may provide indication that the program is erroneous. In this case the user may 
decide to start the process of specification and error location. The results of the 
type inference may facilitate it, as discussed below and in a later section. Thus, in our
methodology type inference plays only an auxiliary, though useful, role. 

The methodology is supported by a prototype error locating tool. The present
version of the tool works for a subset of the constraint programming language
CHIP (Cosytec, 1998). However, it can be easily adapted for other CLP languages.
The structure of the tool is illustrated in Fig. 1. The tool includes a type checker, a
type inferencer and a specification editor. The tool has also a library of PED grammars.
Among others, the library provides descriptions of often occurring types and
specifications for built-in predicates. The specification of a program is introduced
through the editor. It may refer to library grammars and/or to grammars provided
by the user together with the checked program.

Fig. 1. The structure of the error locating tool (diagram not reproduced; its recoverable labels are Entry, Program and Specification)

The input consists of a (possibly incomplete) CLP program and of an entry
declaration. The latter is a parametric specification of intended (atomic) initial
calls in terms of some PED grammar. In this way a family of sets is specified.
Each member of the family is a different set of intended calls, corresponding to a
different use of the program. The type inferencer constructs parametric directional
types for all predicates of the program, thus providing a specification such that the
input program is correct wrt it. However, these types may not correspond to
user intentions. This is due to program errors or to inaccuracy of type inference.

The intended types have to be provided by the user. They are introduced in a 
step-wise interactive manner. When providing the type of a predicate the user may 
first inspect the inferred type and accept it, or specify instead a different type. The 
tool monitors the process and immediately reports as an error any violation of the 
verification conditions for the so far introduced types. 

While our approach makes it possible to locate some errors in CLP programs it 
should be clear that it is limited: 

• It locates only type errors. 

• Our types are based on discriminative regular grammars; the expressive power 
of this formalism is limited. 

• To deal with constraints we extend this formalism from terms to constrained
terms. However our treatment of constraints is rather crude. Roughly speaking,
our formalism is able to define only a finite collection of sets of constraints
(for any given variable). This limited approach lets us however find typical
type bugs related to constraints. In our former work (Drabent & Pietrzak,
1998) we studied a more sophisticated (non parametric) type system for
constrained terms. It seems however too complicated. Charatonik (1998) showed
that a certain approach to approximating the semantics of CLP programs is
bound to fail, as the resulting set constraints are undecidable.

• Correctness wrt parametric type specifications requires type correctness for all
values of the type parameters. Thus only quite general sufficient conditions for
correctness are possible. They however seem to work well on typical examples.

A usual question discussed in the literature is the theoretical worst case com- 
plexity of the proposed type checking and type inference algorithms. We show that 
our type checking algorithm for a clause is exponential wrt the number of variable 
repetitions. In our approach to locating errors type inference plays an auxiliary role
and is implemented by an adaptation of the algorithm of (Gallagher & de Waal,
1994) with some ideas of (Mildner, 1999). While we prove soundness of this
adaptation, we do not elaborate on the theoretical complexity issues, which by the way
were not discussed by the authors of the algorithm. As concerns practical efficiency 
of our implementation, it turns out to be satisfactory on all examples we tried so 
far. 

The main original contributions of the paper are: 

• formulation of the concept of partial correctness of CLP programs wrt para- 
metric specifications, 

• a method for proving such correctness, 

• a technique for checking of parametric directional types for CLP programs,
based on this method,

• a prototype tool for locating program errors based on this technique.

The paper is organized as follows. Section 2 surveys some basic concepts on set
constraints and constraint logic programs. Section 3 discusses the notion of
correctness of a CLP program with respect to a specification, a sufficient condition
for partial correctness and a technique for constructing approximations of program
semantics. The main contributions of the paper are presented in the next sections.
Section 4 introduces PED grammars to be used as a parametric specification
formalism for CLP programs. Section 5 introduces the notion of correctness wrt
a parametric specification and presents a method for proving such correctness. It
shows how correctness can be effectively checked in case of parametric specifications
provided as PED grammars. It also discusses how to construct a parametric
specification of a given program. Finally it explains how program errors can be located
by failures of the parametric correctness check. Section 6 discusses the prototype
tool and illustrates its use on simple examples. Section 7 discusses relation to other
work and presents conclusions.

This paper is an extended version of a less formal presentation of this work in
(Drabent et al., 200

2 Preliminaries 

In this section we present some underlying concepts and techniques used in our ap- 
proach. We introduce set constraints and term grammars. They are a tool to define 
sets of terms. Then we generalize them to define sets of constrained terms. The sec- 
tion is concluded with an overview of basic notions of constraint logic programming 
(CLP). 

2.1 Set Constraints 

This section surveys some basic notions and results on set constraints. We will 
extend them later to describe approximations of the semantics of CLP programs 
and to specify user expectations about behaviour of the developed programs. 

We build set expressions from the alphabet consisting of: variables, function symbols
(including constants), the intersection symbol $\cap$ and, for every variable $X$, the
generalized projection symbol $^{-X}$.

A set expression is a variable, a constant, or it has a form $f(e_1, \ldots, e_n)$, $e_1 \cap e_2$,
or $t^{-X}(e)$, where $f$ is an $n$-ary function symbol, $e, e_1, \ldots, e_n$ are set expressions,
$t$ is a term and $X$ a variable. Set expressions built out of variables and function
symbols (so including neither an intersection symbol nor a generalized projection
symbol) are called atomic.

Set expressions are interpreted over the powerset of the Herbrand universe defined
by a given alphabet. A valuation that associates sets of terms to variables extends
to set expressions in a natural way: $\cap$ is interpreted as the intersection operation,
each $n$-ary function symbol ($n \ge 0$) denotes the set construction operation

\[ f(S_1, \ldots, S_n) = \{\, f(t_1, \ldots, t_n) \mid t_i \in S_i,\ i = 1, \ldots, n \,\} \]

(for any sets $S_1, \ldots, S_n$ of ground terms) and symbol $t^{-X}$ denotes the generalized
projection operation

\[ t^{-X}(S) = \{\, X\theta \mid t\theta \in S,\ \theta \text{ is a substitution},\ X\theta \text{ is ground} \,\} \]

(for any term $t$, variable $X$ and set $S$ of ground terms).

Notice that we do not need special symbols for the projection operation and for
the set of all terms. The latter is the value of $t^{-X}(S)$, where $X$ does not occur in $t$
and some instance of $t$ is in $S$. Projection, defined as
$f_{(i)}^{-1}(S) = \{\, t_i \mid f(t_1, \ldots, t_n) \in S \,\}$, can be expressed as
$f_{(i)}^{-1}(S) = f(X_1, \ldots, X_n)^{-X_i}(S)$.

Set expressions defined above are a proper subset of some classes of set expressions
discussed in literature. In particular $t^{-X}(S)$ (where $X$ occurs in $t$) is a special case
of the generalized membership expression of (Talbot et al., 2000); in the notation
of that paper it is $\{\, X \mid \exists_{-X}\, t \in S \,\}$. An (unnamed) operation more general than
$t^{-X}$ has also been used in (Heintze & Jaffar, 1990b).
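The generalized projection can be tried out on small finite sets. The following Python sketch is only an illustration under our own assumptions, not part of the described tool: term variables are strings, every other term is a tuple `(symbol, arg1, ..., argn)`, the set S is finite, and X occurs in t. The names `match` and `generalized_projection` are hypothetical.

```python
def match(pattern, ground, subst):
    """Extend subst so that pattern instantiated by it equals ground, else None.

    Variables are str; every other term is a tuple (symbol, arg1, ..., argn).
    """
    if isinstance(pattern, str):
        if pattern in subst:
            return subst if subst[pattern] == ground else None
        return {**subst, pattern: ground}
    if pattern[0] != ground[0] or len(pattern) != len(ground):
        return None
    for p, g in zip(pattern[1:], ground[1:]):
        subst = match(p, g, subst)
        if subst is None:
            return None
    return subst


def generalized_projection(t, x, s):
    """Values of x over all substitutions that map t into the finite set s.

    Assumes x occurs in t; otherwise the result would be the whole
    Herbrand universe, which this finite sketch cannot represent.
    """
    result = set()
    for ground in s:
        theta = match(t, ground, {})
        if theta is not None:
            result.add(theta[x])
    return result


a, b, c = ("a",), ("b",), ("c",)
S = {("f", a, b), ("f", c, c), ("g", a)}
print(generalized_projection(("f", "X", "X"), "X", S))  # only f(c, c) matches
print(generalized_projection(("f", "X", "Y"), "X", S))  # first arguments of both f-terms
```

With t = f(X, X) the repeated variable forces both arguments to coincide, which is why f(a, b) contributes nothing.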



Our choice of the class of set expressions is guided by our application, which 
is parametric descriptive types for CLP programs. Later on we generalize set ex- 
pressions to deal with sets of constrained terms (instead of terms) and to include 
parametric set expressions. 

The set constraints we consider are of the form 

Variable > Set expression 

An interpretation of set constraints is defined by a valuation of variables as sets of
ground terms. A model of a constraint is an interpretation that satisfies it when
$>$ is interpreted as set inclusion $\supseteq$. Ordering on interpretations is defined by set
inclusion: $I \le I'$ iff $I(X) \subseteq I'(X)$ for every variable $X$. In such a case we will say
that $I'$ approximates $I$. It can be proved (see for instance (Talbot et al., 2000) and
Proposition ^.9) that a collection $G$ of such constraints is satisfiable and has the
least model, to be denoted $\mathcal{M}_G$. The value of a set expression $e$ in the least model
of $G$ will be denoted by $[\![e]\!]_G$; the subscript may be omitted when it is clear from
the context.



2.1.1 Term Grammars 

A finite set of constraints of the form 

Variable > Atomic set expression 

will be called a term grammar. The least model of such a set of constraints can be
obtained by assigning to each variable $X$ the set of all ground terms derivable from
$X$ in this grammar. The derivability relation $\Rightarrow^*_G$ of a grammar $G$ is defined in a
natural way: some occurrence of a variable $X$ in a given atomic set expression is
replaced by a set expression $e$ such that $X > e$ is a constraint in $G$. Then $[\![X]\!]_G$ is
the set of all ground terms derivable from $X$ in $G$.

A set $S$ is said to be defined by a grammar $G$ if there is a variable $X$ of $G$ such
that $S = [\![X]\!]_G$. A grammar rule $X > t$ will be sometimes called a rule for $X$.






Wlodzimierz Drabent, Jan Maluszynski and Pawel Pietrzak 



Example 2.1

For the following grammar the elements of $[\![\mathit{List}]\!]$ can be viewed as lists of bits.

\[
\begin{array}{ll}
\mathit{List} > \mathit{nil} & B > 0 \\
\mathit{List} > \mathit{cons}(B, \mathit{List}) & B > 1
\end{array}
\]

A pair $(X, G)$ of a variable $X$ and a grammar $G$ uniquely determines the set
$[\![X]\!]_G$ defined by the grammar; such a pair will be called a set descriptor (or a type
descriptor). Sometimes we will say that $(X, G)$ defines the set $[\![X]\!]_G$. By $(X)_G$ we
denote the collection of all rules of $G$ applicable in derivations starting from $X$.
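To make the set defined by such a grammar concrete, the following Python sketch enumerates the ground terms derivable from a variable up to a given height. The encoding is our own assumption for illustration: a grammar is a dictionary mapping each variable to its rules, each rule maps a function symbol to the tuple of argument variables, and ground terms are tuples `(symbol, subterm1, ...)`. The names `BIT_LISTS` and `terms` are hypothetical.

```python
from itertools import product

# The bit-list grammar of Example 2.1 in the assumed dictionary encoding.
BIT_LISTS = {
    "List": {"nil": (), "cons": ("B", "List")},
    "B": {"0": (), "1": ()},
}


def terms(grammar, var, height):
    """Ground terms of height at most `height` derivable from `var`."""
    if height == 0:
        return set()
    found = set()
    for symbol, args in grammar.get(var, {}).items():
        # One set of candidate subterms per argument variable of the rule.
        choices = [terms(grammar, arg, height - 1) for arg in args]
        for subterms in product(*choices):
            found.add((symbol, *subterms))
    return found


for term in sorted(terms(BIT_LISTS, "List", 2)):
    print(term)
```

The enumeration recomputes subresults and is meant only for inspecting tiny grammars; it is not an efficient algorithm.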

We will mostly use a special kind of term grammars. 

Definition 2.2 

A term grammar is called discriminative iff 

• each right hand side of a constraint is of the form $f(X_1, \ldots, X_n)$, where
$X_1, \ldots, X_n$ are variables, and

• for a given variable $X$ and given $n$-ary function symbol $f$ there is at most one
constraint of the form $X > f(\ldots)$.
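The two conditions of Definition 2.2 are purely syntactic, so they can be checked rule by rule. The sketch below is an illustration under assumed conventions: a grammar is a list of pairs `(variable, right_hand_side)`, variables are strings, and a non-variable right hand side is a tuple `(symbol, arg1, ..., argn)`. The function name `is_discriminative` is hypothetical.

```python
def is_discriminative(rules):
    """Check both conditions of the definition on a list of (lhs, rhs) rules."""
    seen = set()
    for lhs, rhs in rules:
        if isinstance(rhs, str):
            return False              # right hand side is a bare variable
        symbol, args = rhs[0], rhs[1:]
        if not all(isinstance(arg, str) for arg in args):
            return False              # some argument is not a variable
        key = (lhs, symbol, len(args))
        if key in seen:
            return False              # second rule for the same variable and symbol
        seen.add(key)
    return True


bit_lists = [
    ("List", ("nil",)),
    ("List", ("cons", "B", "List")),
    ("B", ("0",)),
    ("B", ("1",)),
]
nested = [("List", ("cons", ("0",), "List"))]
repeated = [("X", ("f", "A", "B")), ("X", ("f", "C", "D"))]

print(is_discriminative(bit_lists))  # True
print(is_discriminative(nested))     # False
print(is_discriminative(repeated))   # False
```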

It should be mentioned that discriminative term grammars are just another view
of deterministic top-down tree automata (Comon et al., 1997). Variables of a
grammar are states of an automaton, grammar derivations can be seen as computations
of automata. Abandoning the second condition from Definition 2.2 leads to a strictly
stronger formalism of non discriminative grammars equivalent to nondeterministic
top-down tree automata.

We should explain our choice of the less powerful formalism of discriminative 
grammars. They seem to be sufficient to describe those sets which are usually
considered to be types (Aiken & Lakshman, 1994) and also easier to understand
for the user, which is important in our application. One of the goals of this work
is enhancing term grammars with parameters. It seems reasonable to begin with a 
simpler formalism. We also want to find out to which extent a simpler formalism is 
sufficient in practice. 



2.1.2 Operations on Term Grammars

The role of discriminative grammars is to define sets of terms. One needs to con- 
struct grammars describing the results of set operations on such sets. In this section 
we survey some operations on discriminative grammars, corresponding to set
operations. A more formal presentation is given in Section 4 where we introduce a
generalization of term grammars.

Emptiness check. A variable $X$ in a grammar $G$ will be called nullable if no
ground term can be derived from $X$ in $G$. In other words, $[\![X]\!]_G = \emptyset$ iff $X$ is nullable
in $G$. To check whether $[\![X]\!]_G = \emptyset$, one can apply algorithms for finding nullable
symbols in context-free grammars. This can be done in linear time (Hopcroft et al.,
2001).

Let $G'$ be the grammar $G$ without the rules containing nullable symbols. Both
grammars define the same sets, $[\![X]\!]_G = [\![X]\!]_{G'}$ for any variable $X$.
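The emptiness check can be sketched as a fixpoint computation of the variables from which some ground term is derivable; all remaining variables are nullable in the sense used here. The Python code below is an illustration only: it uses an assumed encoding (a dictionary from variables to rules, each rule mapping a function symbol to its tuple of argument variables) and a simple iteration that is quadratic in the worst case rather than the linear-time algorithm mentioned above. The names `nonempty_variables` and `prune` are hypothetical.

```python
def nonempty_variables(grammar):
    """Variables from which at least one ground term can be derived."""
    nonempty = set()
    changed = True
    while changed:
        changed = False
        for var, rules in grammar.items():
            if var not in nonempty and any(
                all(arg in nonempty for arg in args) for args in rules.values()
            ):
                nonempty.add(var)
                changed = True
    return nonempty


def prune(grammar):
    """Drop every rule that mentions a nullable variable (the grammar G')."""
    keep = nonempty_variables(grammar)
    return {
        var: {f: args for f, args in rules.items() if all(a in keep for a in args)}
        for var, rules in grammar.items()
        if var in keep
    }


# Z is nullable: its only rule Z > g(Z) never reaches a ground term.
G = {"X": {"a": (), "f": ("X", "Z")}, "Z": {"g": ("Z",)}}
print(nonempty_variables(G))
print(prune(G))
```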

Construction. If $S_1, \ldots, S_n$ are defined by $(X_1, G_1), \ldots, (X_n, G_n)$, where
$G_1, \ldots, G_n$ are discriminative grammars with disjoint sets of variables then the
set $f(S_1, \ldots, S_n)$ is defined by $(X, G)$ where $G$ is the discriminative grammar
$\{X > f(X_1, \ldots, X_n)\} \cup G_1 \cup \ldots \cup G_n$ and $X$ is a new variable, not occurring
in $G_1, \ldots, G_n$.

Intersection. Given sets $S$ and $T$ defined by discriminative grammars $G_1$ and
$G_2$ we construct a discriminative grammar $G$ such that $S \cap T$ is defined by $G$.
Without loss of generality we assume that $G_1$ and $G_2$ have no common variables.
The variables of $G$ correspond to pairs $(X, Y)$ where $X$ is a variable of $G_1$ and $Y$
is a variable of $G_2$. They will be denoted $X \dot\cap Y$. The notation reflects the intention
that $[\![X \dot\cap Y]\!]_G = [\![X]\!]_{G_1} \cap [\![Y]\!]_{G_2}$.

Now $G$ is defined as the set of all rules

\[ X \dot\cap Y > f(X_1 \dot\cap Y_1, \ldots, X_n \dot\cap Y_n) \]

such that there exist a rule $X > f(X_1, \ldots, X_n)$ in $G_1$ and a rule $Y > f(Y_1, \ldots, Y_n)$
in $G_2$. Notice that for given $f$ at most one rule of this form may exist in each of
the grammars. Thus $G$ is discriminative. It is not difficult to prove that $[\![X \dot\cap Y]\!]_G$
is indeed the intersection of $[\![X]\!]_{G_1}$ and $[\![Y]\!]_{G_2}$.

We have $S = [\![X]\!]_{G_1}$ for some $X$ of $G_1$ and $T = [\![Y]\!]_{G_2}$ for some $Y$ of $G_2$, hence
$S \cap T$ is defined by $G$. Notice that $G$ may contain nullable symbols even if $G_1$, $G_2$
do not.

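Example 2.3

Consider two grammars

\[
\begin{array}{llll}
G_1: & X > a & \qquad G_2: & Y > a \\
     & X > f(Z, Z) &        & Y > f(E, Y) \\
     & Z > f(X, X) &        & E > a \\
     & Z > b &              & E > b \\
     & Z > g(Z) &           & E > h(E)
\end{array}
\]

The grammar defining the intersections of the sets defined by $G_1$, $G_2$ is

\[
\begin{array}{ll}
G: & X \dot\cap Y > a \\
   & X \dot\cap Y > f(Z \dot\cap E,\ Z \dot\cap Y) \\
   & Z \dot\cap Y > f(X \dot\cap E,\ X \dot\cap Y) \\
   & X \dot\cap E > a \\
   & Z \dot\cap E > b
\end{array}
\]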
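The intersection construction can be replayed mechanically on this example. The following Python sketch is an illustration under our own assumptions: a grammar is a dictionary from variables to rules, each rule maps a function symbol to its tuple of argument variables, and a variable of the result is represented by a pair of variables. Only the pairs reachable from the starting pair are generated. The name `intersect` is hypothetical.

```python
def intersect(g1, g2, x, y):
    """Rules for the pair variables reachable from (x, y)."""
    result = {}
    todo = [(x, y)]
    while todo:
        pair = todo.pop()
        if pair in result:
            continue
        left, right = pair
        rules = {}
        for symbol, args1 in g1.get(left, {}).items():
            args2 = g2.get(right, {}).get(symbol)
            # A rule is produced only when both grammars have a rule for the symbol.
            if args2 is not None and len(args1) == len(args2):
                rules[symbol] = tuple(zip(args1, args2))
        result[pair] = rules
        todo.extend(p for args in rules.values() for p in args)
    return result


G1 = {"X": {"a": (), "f": ("Z", "Z")},
      "Z": {"f": ("X", "X"), "b": (), "g": ("Z",)}}
G2 = {"Y": {"a": (), "f": ("E", "Y")},
      "E": {"a": (), "b": (), "h": ("E",)}}

for pair, rules in sorted(intersect(G1, G2, "X", "Y").items()):
    print(pair, rules)
```

Restricting the construction to reachable pairs is a design choice of this sketch; the definition in the text takes all pairs of variables.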

Union. It is well known that the union of sets defined by discriminative
grammars may not be definable by a discriminative grammar; take for example the sets
$\{f(a, b)\}$ and $\{f(c, d)\}$. Given sets $S$ and $T$ defined by discriminative grammars $G_1$
and $G_2$ we construct now a discriminative grammar $G$ defining a superset of $S \cup T$.

Without loss of generality we assume that $G_1$ and $G_2$ have no common variables.
The variables of $G$ correspond to pairs $(X, Y)$ where $X$ is a variable of $G_1$ and $Y$
is a variable of $G_2$. They will be denoted $X \dot\cup Y$. The notation reflects the intention
that $[\![X]\!]_{G_1} \cup [\![Y]\!]_{G_2} \subseteq [\![X \dot\cup Y]\!]_G$.




Now $G$ consists of the rules of $G_1$, the rules of $G_2$ and of the least set of rules
which can be constructed as follows:

• If $X > f(X_1, \ldots, X_n)$ is in $G_1$ and $Y > f(Y_1, \ldots, Y_n)$ is in $G_2$ then
$X \dot\cup Y > f(X_1 \dot\cup Y_1, \ldots, X_n \dot\cup Y_n)$ is in $G$,

• If $X > f(X_1, \ldots, X_n)$ is in $G_1$ and no rule $Y > f(Y_1, \ldots, Y_n)$ is in $G_2$ then
$X \dot\cup Y > f(X_1, \ldots, X_n)$ is in $G$,

• If no rule $X > f(X_1, \ldots, X_n)$ is in $G_1$ and $Y > f(Y_1, \ldots, Y_n)$ is in $G_2$ then
$X \dot\cup Y > f(Y_1, \ldots, Y_n)$ is in $G$.

It is not difficult to see that the obtained grammar $G$ is discriminative, and that
$[\![X \dot\cup Y]\!]_G$ is indeed a superset of the union of $[\![X]\!]_{G_1}$ and $[\![Y]\!]_{G_2}$. If the first case
is not involved in the construction the result is the union of these sets. If $G_1$, $G_2$
do not contain nullable symbols then $[\![X \dot\cup Y]\!]_G$ is the tuple-distributive closure of
$[\![X]\!]_{G_1} \cup [\![Y]\!]_{G_2}$, i.e. the least set definable by a discriminative grammar and including
$[\![X]\!]_{G_1} \cup [\![Y]\!]_{G_2}$. (We skip a proof of this fact, we do not use it later). So we are
able to obtain the best possible approximation of the union by a discriminative
grammar.

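Example 2.4

The singleton sets $\{f(a, b)\}$ and $\{f(c, d)\}$ can be defined by the grammars:

\[
G_1:\ X > f(A, B),\ A > a,\ B > b \qquad G_2:\ Y > f(C, D),\ C > c,\ D > d.
\]

Applying the construction we obtain additional rules:

\[
\begin{array}{lll}
X \dot\cup Y > f(A \dot\cup C,\ B \dot\cup D) & A \dot\cup C > a & B \dot\cup D > b \\
 & A \dot\cup C > c & B \dot\cup D > d
\end{array}
\]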
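The over-approximation in this example can be observed directly. The Python sketch below is an illustration under assumed conventions: grammars are dictionaries from variables to rules, each rule maps a function symbol to its tuple of argument variables, the new union variables are represented by pairs, and ground terms are tuples `(symbol, subterm1, ...)`. The names `union` and `derives` are hypothetical.

```python
def union(g1, g2, x, y):
    """Rules of g1 and g2 plus the rules for pair variables reachable from (x, y)."""
    g = {**g1, **g2}
    todo = [(x, y)]
    while todo:
        pair = todo.pop()
        if pair in g:
            continue
        r1, r2 = g1.get(pair[0], {}), g2.get(pair[1], {})
        rules = {}
        for symbol in set(r1) | set(r2):
            if symbol in r1 and symbol in r2:
                args = tuple(zip(r1[symbol], r2[symbol]))  # first case: pair the arguments
                todo.extend(args)
            elif symbol in r1:
                args = r1[symbol]                          # rule only in g1
            else:
                args = r2[symbol]                          # rule only in g2
            rules[symbol] = args
        g[pair] = rules
    return g


def derives(g, var, term):
    """Is the ground term derivable from var in the discriminative grammar g?"""
    args = g.get(var, {}).get(term[0])
    return (args is not None and len(args) == len(term) - 1
            and all(derives(g, v, s) for v, s in zip(args, term[1:])))


G1 = {"X": {"f": ("A", "B")}, "A": {"a": ()}, "B": {"b": ()}}
G2 = {"Y": {"f": ("C", "D")}, "C": {"c": ()}, "D": {"d": ()}}
G = union(G1, G2, "X", "Y")
a, b, c, d = ("a",), ("b",), ("c",), ("d",)
print(derives(G, ("X", "Y"), ("f", a, b)))  # member of the union
print(derives(G, ("X", "Y"), ("f", a, d)))  # in neither set, yet derivable
```

The term f(a, d) belongs to neither singleton set but is accepted by the constructed grammar, which is the loss of precision described above.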

Set inclusion. Given sets $S$ and $T$ defined by discriminative grammars it is
possible to check $S \subseteq T$ by examination of the defining grammars.

By the assumption $S = [\![X]\!]_{G_1}$, $T = [\![Y]\!]_{G_2}$ for some discriminative grammars
$G_1$, $G_2$ and some variables $X$, $Y$. We assume without loss of generality that $G_1$, $G_2$
do not contain nullable symbols. (Otherwise the nullable symbols may be removed
as justified previously).

It follows from the definition of the set defined by term grammar that $[\![X]\!]_{G_1} \subseteq
[\![Y]\!]_{G_2}$ iff for every rule of the form $X > f(X_1, \ldots, X_n)$ in $G_1$ there exists a rule
$Y > f(Y_1, \ldots, Y_n)$ in $G_2$ and $[\![X_i]\!]_{G_1} \subseteq [\![Y_i]\!]_{G_2}$ for $i = 1, \ldots, n$. This corresponds to
a recursive procedure where a check for $X$, $Y$ corresponds to comparison of function
symbols in the defining rules for $X$ and $Y$, which may cause a failure, and a recursive
call of a finite number of such checks. The check performed once for a given pair of
variables need not be repeated. As the grammar is finite there is a finite number of
pairs of variables so that the check will terminate.



For a formal description of the algorithm and a correctness proof see Section 4.4.5 
where a more general inclusion check algorithm is presented. 

Example 2.5 

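The following example illustrates inclusion checking. It shows that the set of
non-empty bit lists with even length is a subset of the set of unrestricted lists which
allow a more general kind of elements. Both sets are described by discriminative
grammars.

\[
\begin{array}{ll}
S > \mathit{cons}(B, \mathit{Odd}) & \mathit{List} > \mathit{nil} \\
\mathit{Odd} > \mathit{cons}(B, \mathit{Even}) & \mathit{List} > \mathit{cons}(E, \mathit{List}) \\
\mathit{Even} > \mathit{nil} & E > 0 \\
\mathit{Even} > \mathit{cons}(B, \mathit{Odd}) & E > 1 \\
B > 0 & E > s(E) \\
B > 1 &
\end{array}
\]

We check inclusion $[\![S]\!] \subseteq [\![\mathit{List}]\!]$. We show steps of this process. Each step will be
characterized by three items: the checked pair of variables, the function symbols in
their defining rules, the set of pairs to be checked after this step.

\[
\begin{array}{lll}
(S, \mathit{List}) & (\{\mathit{cons}\}, \{\mathit{nil}, \mathit{cons}\}) & \{(B, E), (\mathit{Odd}, \mathit{List})\} \\
(B, E) & (\{0, 1\}, \{0, 1, s\}) & \{(\mathit{Odd}, \mathit{List})\} \\
(\mathit{Odd}, \mathit{List}) & (\{\mathit{cons}\}, \{\mathit{nil}, \mathit{cons}\}) & \{(\mathit{Even}, \mathit{List})\} \\
(\mathit{Even}, \mathit{List}) & (\{\mathit{nil}, \mathit{cons}\}, \{\mathit{nil}, \mathit{cons}\}) & \emptyset
\end{array}
\]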
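The steps of this example follow from a short recursive procedure. The Python sketch below is an illustration, not the algorithm of the tool: it assumes that neither grammar contains nullable symbols, that a grammar is a dictionary from variables to rules, and that each rule maps a function symbol to its tuple of argument variables. The name `included` is hypothetical.

```python
def included(g1, x, g2, y, visited=None):
    """Check that the set defined by (x, g1) is a subset of the one defined by (y, g2).

    Assumes both grammars are discriminative and free of nullable symbols.
    """
    visited = set() if visited is None else visited
    if (x, y) in visited:
        return True                    # this pair is already checked or in progress
    visited.add((x, y))
    for symbol, args1 in g1.get(x, {}).items():
        args2 = g2.get(y, {}).get(symbol)
        if args2 is None or len(args1) != len(args2):
            return False               # g2 has no rule for this function symbol
        if not all(included(g1, a, g2, b, visited) for a, b in zip(args1, args2)):
            return False
    return True


EVEN = {"S": {"cons": ("B", "Odd")},
        "Odd": {"cons": ("B", "Even")},
        "Even": {"nil": (), "cons": ("B", "Odd")},
        "B": {"0": (), "1": ()}}
LISTS = {"List": {"nil": (), "cons": ("E", "List")},
         "E": {"0": (), "1": (), "s": ("E",)}}

checked = set()
print(included(EVEN, "S", LISTS, "List", checked))  # True
print(sorted(checked))                              # the four pairs of the example
print(included(LISTS, "List", EVEN, "S"))           # False: nil is not derivable from S
```

Treating a pair that is still being checked as successful is sound here because any failure found later propagates to the top-level call.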

Generalized projection. Assume that $S = [\![Y]\!]_G$ is defined by a discriminative
grammar $G$. We show that $t^{-X}(S)$ is defined by a discriminative grammar.

Consider a term $t$ and a mapping $\xi(t, G, Y)$ assigning a variable $V_u$ of $G$ to each
subterm occurrence $u$ of $t$, such that $V_t$ is $Y$ and if $u = f(u_1, \ldots, u_n)$ ($n \ge 0$)
then there exists a rule $V_u > f(V_{u_1}, \ldots, V_{u_n})$ in $G$. So for instance in Example 2.5
taking $t = \mathit{cons}(s(X), Z)$ and $Y = \mathit{List}$ results in $V_t = \mathit{List}$, $V_{s(X)} = E$, $V_Z =
\mathit{List}$, $V_X = E$. If such a mapping exists then it is unique, as the grammar contains
at most one rule $V > f(\ldots)$ for given $V$, $f$.

The mapping can be found by an obvious algorithm. It traverses $t$ top-down
and for each occurrence $u$ of a non-variable subterm it finds the unique rule $V_u >
f(V_{u_1}, \ldots, V_{u_n})$. The rule determines the variables $V_{u_1}, \ldots, V_{u_n}$ corresponding to
the greatest proper subterms of $u$. If such a rule does not exist, mapping $\xi(t, G, Y)$
does not exist. The starting point is $u = t$ and $V_u = Y$.

Notice that if $t\theta \in S$ then $\xi(t, G, Y)$ exists and $u\theta \in [\![V_u]\!]_G$ for each subterm
occurrence $u$ in $t$. Hence $X\theta \in [\![V_{X^i}]\!]_G$ for each occurrence $X^i$ of $X$ in $t$. Thus
$t^{-X}(S) \subseteq \bigcap_i [\![V_{X^i}]\!]_G$. (If $X$ does not occur in $t$ then $\bigcap_i [\![V_{X^i}]\!]_G$ denotes the
Herbrand universe.) On the other hand, assume that $\xi(t, G, Y)$ exists and for each
variable $Z$ of $t$ there exists a term $u_Z$ such that $u_Z \in [\![V_{Z^i}]\!]_G$ for each occurrence
$Z^i$ of $Z$ in $t$. Then $t\theta \in S$, where $\theta = \{\, Z/u_Z \mid Z \text{ occurs in } t \,\}$. Thus if $\xi(t, G, Y)$
exists and $\bigcap_i [\![V_{Z^i}]\!]_G$ is nonempty for each $Z$ then

\[ t^{-X}(S) = \bigcap_i [\![V_{X^i}]\!]_G . \]

Otherwise $t^{-X}(S) = \emptyset$.

Applying algorithms described previously, we can construct for each $Z$ a
discriminative grammar $G_Z$ defining $[\![Z']\!]_{G_Z} = \bigcap_i [\![V_{Z^i}]\!]_G$ and check this set for emptiness.
This provides an algorithm which, given $G$, $Y$, $t$, produces for each $X$ occurring in
$t$ a discriminative grammar $G_X$ and a variable $X'$ such that $t^{-X}(S) = [\![X']\!]_{G_X}$.

An algorithm similar to the one presented above is used in the implementation of
(Gallagher & de Waal, 1994); it is however only superficially described in that
paper.
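The top-down traversal that assigns grammar variables to subterm occurrences can be sketched as follows. This Python fragment is an illustration under our own assumptions: variables of the term are strings, non-variable terms are tuples `(symbol, subterm1, ...)`, and the grammar is a dictionary from variables to rules, each rule mapping a function symbol to its tuple of argument variables. It covers only the first phase of the algorithm; intersecting the collected grammar variables and testing for emptiness are left out. The name `variable_map` is hypothetical.

```python
def variable_map(term, grammar, start):
    """For each variable of the term, list the grammar variables of its occurrences.

    Returns None when the mapping does not exist.
    """
    occurrences = {}

    def walk(u, v):
        if isinstance(u, str):                    # an occurrence of a term variable
            occurrences.setdefault(u, []).append(v)
            return True
        args = grammar.get(v, {}).get(u[0])       # the unique rule v > f(...), if any
        if args is None or len(args) != len(u) - 1:
            return False
        return all(walk(sub, arg) for sub, arg in zip(u[1:], args))

    return occurrences if walk(term, start) else None


LISTS = {"List": {"nil": (), "cons": ("E", "List")},
         "E": {"0": (), "1": (), "s": ("E",)}}

print(variable_map(("cons", ("s", "X"), "Z"), LISTS, "List"))
print(variable_map(("cons", "X", "X"), LISTS, "List"))
print(variable_map(("s", "X"), LISTS, "List"))
```

For the term cons(X, X) the two occurrences of X receive the grammar variables E and List, whose sets share no function symbol, so the projection onto X would be empty.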



2.2 Specifying sets of constrained terms 

Set constraints and term grammars are formalisms for defining subsets of the Her- 
brand universe. This is not sufficient for the purposes of CLP. We use a CLP 
semantics based on the notion of a constrained expression. The goal of this section 
is generalizing discriminative term grammars to a mechanism of defining sets of 
constrained terms. 



2.2.1 Constrained expressions 

CLP programs operate on constraint domains. A constraint domain is defined by
providing a finite signature (of predicate and function symbols) and a structure $\mathcal{D}$
over this signature.² Predicate symbols of the signature are divided into constraint
predicates and non-constraint predicates. The former have a fixed interpretation in
$\mathcal{D}$, the interpretation of the latter is defined by programs. All the function symbols
have a fixed interpretation, they are interpreted as constructors. So the elements of
$\mathcal{D}$ can be seen as (finite) terms built from some elementary values and the constant
symbols by means of constructors. That is why we will often call them $\mathcal{D}$-terms. In
CLP some function symbols have also other meaning (like + denoting addition in
CLP over integers). This meaning is employed only in the semantics of constraint
predicates.

We treat function symbols as constructors, because this happens in the semantics
of most CLP languages, like CHIP or SICStus Prolog (Cosytec, 1998; SICS, 1998).
They use syntactic unification. For instance, in CLP over integers, terms like 1 + 3,
2 + 2, 1 * 4, 4 are (pairwise) not unifiable. Only the constraint predicates recognize
their numerical values. So 2 + 2 #= 1*4 succeeds and 2 + 2 #> 3*4 fails
(where #=, #> are constraint predicates of, respectively, arithmetical equality
and comparison).
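The distinction between syntactic unification and the numerical reading used by constraint predicates can be illustrated by a minimal Python sketch. The term encoding (integers, capitalised strings for variables, tuples for compound terms) and the helper names `unify`, `arith_eq` and `arith_gt` are assumptions made for this illustration only; they are not part of any CLP system.

```python
def is_var(t):
    # Assumption of this sketch: variables are capitalised strings.
    return isinstance(t, str) and t[:1].isupper()

def walk(t, subst):
    # Follow variable bindings until an unbound variable or a non-variable.
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst=None):
    """Syntactic unification (no occurs check); returns a dict or None."""
    subst = dict(subst or {})
    stack = [(a, b)]
    while stack:
        x, y = stack.pop()
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_var(x):
            subst[x] = y
        elif is_var(y):
            subst[y] = x
        elif (isinstance(x, tuple) and isinstance(y, tuple)
              and x[0] == y[0] and len(x) == len(y)):
            stack.extend(zip(x[1:], y[1:]))
        else:
            return None  # clash of constructors: not unifiable
    return subst

def value(t):
    """Numerical value of a ground term built from integers, + and *."""
    if isinstance(t, int):
        return t
    op, left, right = t
    return value(left) + value(right) if op == '+' else value(left) * value(right)

def arith_eq(a, b):   # plays the role of #=
    return value(a) == value(b)

def arith_gt(a, b):   # plays the role of #>
    return value(a) > value(b)

# The four terms 1+3, 2+2, 1*4, 4 from the text.
terms = [('+', 1, 3), ('+', 2, 2), ('*', 1, 4), 4]
```

In this sketch the four terms are pairwise non-unifiable although `value` maps each of them to 4; only `arith_eq` and `arith_gt` look at the numerical values.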

By a constraint we mean an atomic formula with a constraint predicate, $c_1 \wedge c_2$,
$c_1 \vee c_2$, or $\exists X\, c_1$, where $c_1$ and $c_2$ are constraints and $X$ is a variable. We will
often write $c_1, c_2$ for $c_1 \wedge c_2$. The fact that a constraint $c$ is true for every variable
valuation will be denoted by $\mathcal{D} \models c$.

The Herbrand domain of logic programming is generalized to the constraint domain
$\mathcal{D}$ of CLP. The analogous generalization of non-ground atoms and terms is that of
constrained expressions.

Definition 2.6

A constrained expression (atom, term) is a pair $c \,[]\, E$ of a constraint $c$ and an
expression $E$ such that each free variable of $c$ occurs (freely) in $E$.



* Sometimes we slightly abuse the notation and use $\mathcal{D}$ to denote the carrier of $\mathcal{D}$.

A $c \,[]\, E$ with some free variable of $c$ not occurring in $E$ will be treated as an
abbreviation for $(\exists \ldots c) \,[]\, E$, where all variables of $c$ not occurring in $E$ are
existentially quantified.

Definition 2.7

A constrained expression $c' \,[]\, E'$ is an instance of a constrained expression $c \,[]\, E$
if $c'$ is satisfiable in $\mathcal{D}$ and there exists a substitution $\theta$ such that $E' = E\theta$ and
$\mathcal{D} \models c' \to c\theta$ ($c\theta$ means here applying $\theta$ to the free variables of $c$, with a standard
renaming of the non-free variables of $c$ if a conflict arises).

If $c \,[]\, E$ is an instance of $c' \,[]\, E'$ and vice versa then $c \,[]\, E$ is a variant of $c' \,[]\, E'$.

By the instance-closure $cl(E)$ of a constrained expression $E$ we mean the set of
all instances of $E$. For a set $S$ of constrained expressions, its instance-closure $cl(S)$
is defined as $\bigcup_{E \in S} cl(E)$.

Note that, in particular, $c\theta \,[]\, E\theta$ is an instance of $c \,[]\, E$ and that $c' \,[]\, E$ is an instance
of $c \,[]\, E$ whenever $\mathcal{D} \models c' \to c$, provided that $c\theta$ and, respectively, $c'$ are satisfiable.
The relation of being an instance is transitive. (Take an instance $c' \,[]\, E\theta$ of $c \,[]\, E$
and an instance $c'' \,[]\, E\theta\sigma$ of $c' \,[]\, E\theta$. As $\mathcal{D} \models c'' \to c'\sigma$ and $\mathcal{D} \models c' \to c\theta$, we have
$\mathcal{D} \models c'' \to c\theta\sigma$.) Notice also that if $c$ is not satisfiable then $c \,[]\, E$ does not have any
instance (it is not an instance of itself).

We will often not distinguish $E$ from $true \,[]\, E$ and from $c \,[]\, E$ where $\mathcal{D} \models \forall c$.
Similarly, we will also not distinguish $c \,[]\, E$ from $c' \,[]\, E$ when $c$ and $c'$ are equivalent
constraints ($\mathcal{D} \models c \leftrightarrow c'$).

Example 2.8

$a + 7$, $Z + 7$, $1 + 7$ are instances of $X + Y$, but 8 is not.

$f(X){>}3 \,[]\, f(X)+7$ is an instance of $Z{>}3 \,[]\, Z+7$, which is an instance of $Z + 7$,
provided that constraints $f(X){>}3$ and $Z{>}3$, respectively, are satisfiable.

Assume a numerical domain with the standard interpretation of symbols. Then
$4 + 7$ is an instance of $X{=}2{+}2 \,[]\, X+7$ (but not vice versa); the latter is an instance
of $Z{>}3 \,[]\, Z+7$.

Consider CLP(FD) (CLP over finite domains, (Van Hentenryck, 1989)). A domain
variable with the domain $S$, where $S$ is a finite set of natural numbers, can
be represented by a constrained variable $X \in S \,[]\, X$ (with the expected meaning of
the constraint $X \in S$).



2.2.2 Extended Set Constraints 

We use a semantics for CLP which is based on constrained atoms/terms. To ap- 
proximate such semantics we generalize term grammars to describe instance-closed 
sets of constrained terms. In discussing grammars and the generated sets, we will 
not distinguish between predicate and function symbols, and between atoms and 
terms. 

For a given constraint domain $\mathcal{D}$, we introduce some base sets of constrained
terms. We require that base sets are instance-closed. Following (Dart & Zobel,
1992) we extend the alphabet of set constraints by base symbols interpreted as base
sets. Each base symbol $b$ has a fixed corresponding set $[\![b]\!]$ of constrained terms,
$[\![b]\!] \neq \emptyset$. We require that the alphabet of base symbols is finite. We assume that
there is a base symbol $\top$ for which $[\![\top]\!]$ is the set of all constrained terms over the given
$\mathcal{D}$. Usually no other base sets contain (constrained) terms with (non-constant)
function symbols.

For instance, in CLP over finite domains (Van Hentenryck, 1989), $\mathcal{D}$ contains
terms built of symbols and integer numbers. The base sets we use for this domain
are, apart from $[\![\top]\!]$, denoted by base symbols nat, neg, anyfd. They correspond to,
respectively, the natural numbers, the negative integers and finite domain variables.
The latter are represented as constrained variables of the form $X \in S \,[]\, X$, where
$S$ is a finite set of natural numbers. Due to the closedness requirement, $[\![anyfd]\!]$
also contains the natural numbers.

An extended set expression is an expression built out of variables, base symbols,
function symbols (including constants), $\cap$ and the generalized projection symbols.
Extended set expressions are interpreted as instance-closed sets of constrained
terms. In the context of extended set expressions, a valuation is a mapping assigning
instance-closed sets of constrained terms to variables.*

The construction and generalized projection operation for (instance-closed) sets
of constrained terms are defined as

$$f(S_1, \ldots, S_n) = cl(\{\, c_1, \ldots, c_n \,[]\, f(t_1, \ldots, t_n) \mid c_i \,[]\, t_i \in S_i,\ i = 1, \ldots, n \,\}),$$

$$t^{-X}(S) = \{\, c \,[]\, X\theta \mid c \,[]\, t\theta \in S, \text{ for some substitution } \theta \,\},$$

for instance-closed sets $S, S_1, \ldots, S_n$, a function (or predicate) symbol $f$, a term (or
an atom) $t$ and a variable $X$. Notice that $f(S_1, \ldots, S_n)$ and $t^{-X}(S)$ are instance-closed.
A valuation, together with a fixed valuation of base symbols, extends in a natural
way to extended set expressions. So if sets $S_1, \ldots, S_n$ are values of expressions
$e_1, \ldots, e_n$ then the value of $f(e_1, \ldots, e_n)$ is $f(S_1, \ldots, S_n)$. For a ground extended
set expression $t$ its value will be denoted by $[\![t]\!]$.
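To make the two operations concrete, the following Python sketch evaluates them on finite sets of ground terms only. This is a deliberate simplification: constraints, non-ground terms and instance closure are all ignored, and the term encoding and the function names `construction`, `projection` and `match` are assumptions of the sketch rather than anything defined in the paper.

```python
import itertools

def is_var(t):
    # Assumption of this sketch: variables are capitalised strings.
    return isinstance(t, str) and t[:1].isupper()

def match(pattern, term, binding):
    """One-way matching of a pattern against a ground term.

    Returns an extended binding dict, or None if the term is not an
    instance of the pattern under the given binding.
    """
    if is_var(pattern):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if isinstance(pattern, tuple):
        if not (isinstance(term, tuple) and term[0] == pattern[0]
                and len(term) == len(pattern)):
            return None
        for p, t in zip(pattern[1:], term[1:]):
            binding = match(p, t, binding)
            if binding is None:
                return None
        return binding
    return binding if pattern == term else None

def construction(f, *sets):
    """Ground version of f(S1, ..., Sn): all terms f(t1, ..., tn), ti in Si."""
    return {(f, *args) for args in itertools.product(*sets)}

def projection(pattern, var, terms):
    """Ground version of t^{-X}(S) for a variable X that occurs in t."""
    result = set()
    for term in terms:
        binding = match(pattern, term, {})
        if binding is not None and var in binding:
            result.add(binding[var])
    return result

# S = cons({1, 2}, {nil}); projecting cons(X, Y) onto X recovers {1, 2}.
S = construction('cons', {1, 2}, {'nil'})
heads = projection(('cons', 'X', 'Y'), 'X', S)
```

A repeated variable in the pattern, as in `('cons', 'X', 'X')`, forces both arguments to agree, which is why the projection of `S` along that pattern is empty.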

Extended set expressions can be used to construct set constraints and grammars.
We consider extended set constraints of the form $X > t$, where $X$ is a variable and $t$
an extended set expression. An extended term grammar is a set of constraints (often
called rules) of the form $X > t$, where $t$ is an atomic set expression (i.e. one built
out of variables, the base symbols and the function symbols, including constants).

A model of a set $C$ of extended set constraints is a valuation $I$, under which
$I(X) \supseteq I(t)$ for each constraint $X > t$ of $C$.

Proposition 2.9 

Any set C of extended set constraints has the least model. 



* Notice that we have two different languages using variables: the language of set expressions (and
of set constraints and grammars), with variables ranging over sets of constrained terms, and the
language of constrained terms with variables ranging over a specific constraint domain. In this
paper we use the same notation for both kinds of variables. This should cause no confusion; the
kind of a variable is determined by the context.

Proof

We show that the set of models of $C$ is nonempty and that their greatest lower
bound is a model of $C$.

The valuation $I$ assigning to each variable the set $[\![\top]\!]$ of all constrained terms is a model of any
extended set constraint.

The greatest lower bound of a set $\mathcal{I}$ of valuations is a valuation $\bigcap \mathcal{I}$ such that
$(\bigcap \mathcal{I})(X) = \bigcap \{\, I(X) \mid I \in \mathcal{I} \,\}$, for any variable $X$.

Let $o$ be a construction operation, a generalized projection operation or $\cap$. Let
$k$ be its arity. For $i = 1, \ldots, k$, let $\mathcal{S}_i$ be a set of instance-closed sets of constrained
terms. We have

$$o(\textstyle\bigcap \mathcal{S}_1, \ldots, \bigcap \mathcal{S}_k) \subseteq \bigcap \{\, o(S_1, \ldots, S_k) \mid S_1 \in \mathcal{S}_1, \ldots, S_k \in \mathcal{S}_k \,\}.$$

(We do not need here to show equality.) Hence for any extended set expression $t$
and any set $\mathcal{I}$ of valuations

$$(\textstyle\bigcap \mathcal{I})(t) \subseteq \bigcap \{\, I(t) \mid I \in \mathcal{I} \,\},$$

by induction on the structure of $t$. Hence if each element of $\mathcal{I}$ is a model of an
extended set constraint $X > t$ then $\bigcap \mathcal{I}$ is a model of $X > t$, as $(\bigcap \mathcal{I})(X) =
\bigcap \{\, I(X) \mid I \in \mathcal{I} \,\} \supseteq \bigcap \{\, I(t) \mid I \in \mathcal{I} \,\} \supseteq (\bigcap \mathcal{I})(t)$. Thus if $\mathcal{I}$ is the set of models of $C$
then $\bigcap \mathcal{I}$ is a model of $C$, hence the least model. □

Definition 2.10

The set defined by a variable $X$ in an extended term grammar $G$ is

$$[\![X]\!]_G = \{\, c \,[]\, u \mid c \,[]\, u \in [\![t]\!],\ X \Rightarrow_G^* t \text{ and no variable occurs in } t \,\}$$

where the derivability relation $\Rightarrow_G^*$ is defined as for term grammars.

Notice that we avoid confusion between the variables of grammars and the variables
of constrained terms. The former occur in derivations, which end with ground
terms built of function symbols (including constants) and of base symbols. The
latter appear later on as a result of evaluation of base symbols in these ground
terms.

The notation $[\![X]\!]_G$ is justified here by the following property.

Proposition 2.11

Let $G$ be an extended term grammar and $I$ the interpretation such that $I(X) =
[\![X]\!]_G$ for each variable $X$. Then $I$ is the least model of $G$.

Proof

Consider a variable $X$ and a constrained term $c \,[]\, s \in [\![X]\!]_G$. So there exists a derivation
$X \Rightarrow_G^* t$ such that $c \,[]\, s \in [\![t]\!]$. By induction on the length of the derivation, for
any model $J$ of $G$, $[\![t]\!] \subseteq J(X)$. Thus $I(X) \subseteq J(X)$. Hence $I \leq J$. □
Definition 2.12

An extended discriminative term grammar $G$ is a finite set of rules of the form

$$X > f(X_1, \ldots, X_n) \quad \text{or} \quad X > b$$

where $f$ is an $n$-ary function symbol ($n \geq 0$), $X, X_1, \ldots, X_n$ are variables and $b$ is
a base symbol. Additionally, for each pair of rules $X > t_1$ and $X > t_2$ in $G$ the
sets $[\![t_1^\top]\!]$ and $[\![t_2^\top]\!]$ are disjoint (where $u^\top$ stands for $u$ with each occurrence of a
variable replaced by $\top$).

So no two rules $X > f(X)$, $X > f(Y)$ may occur in such a grammar. The same holds
for $X > b$, $X > b'$ where $b, b'$ are base symbols and $[\![b]\!] \cap [\![b']\!] \neq \emptyset$. If a discriminative
grammar contains $X > f(X)$ and $X > b$ then no (constrained) term with the main
symbol $f$ occurs in $[\![b]\!]$. If the grammar contains $X > \top$ then it is the only rule for
$X$.

The question is how to represent/approximate by such grammars the results of 
set operations for sets represented by such grammars, and how to check inclusion 
for such sets. We address these questions under some additional restrictions on base 
sets, which seem to be observed in base domains of CLP languages. We require that: 



Requirement 2.13

• For any base symbol $b$ different from $\top$, $f(X_1, \ldots, X_n)^{-X_i}([\![b]\!]) = \emptyset$ for every $f$, $i$. (So $[\![b]\!]$
does not contain elements of the form $c \,[]\, f(t)$, for any non-constant $f$.)

• For each pair $b_1, b_2$ of distinct base symbols the base sets $[\![b_1]\!]$, $[\![b_2]\!]$ are either
disjoint or one is a subset of the other. Moreover $[\![b_1]\!] \neq [\![b_2]\!]$.

The number of base symbols is finite. Their interpretation is fixed. We can construct
a table showing, for each pair $b_1, b_2$ of base symbols, whether $[\![b_1]\!] \cap [\![b_2]\!] = \emptyset$,
$[\![b_1]\!] \subseteq [\![b_2]\!]$ or $[\![b_2]\!] \subseteq [\![b_1]\!]$.



Now, the operations on grammars of Section 2.1.1 can be easily extended. Each 
of them traverses the rules in the argument grammars. Eventually we may reach 
a point when a base symbol is encountered instead of a constant. These cases are 
handled in a rather obvious way, using the table described above. Similarly as for 
discriminative term grammars, one obtains approximation of the union and exact 
intersection, generalized projection and construction. 



We postpone a formal presentation to Section 4.4, where we deal with a general- 
ization of grammars discussed here. 

Example 2.14

Consider CLP(FD) (Van Hentenryck, 1989). The following discriminative extended
grammars describe, respectively, integer lists and lists of finite domain variables
(possibly instantiated to natural numbers):

    Li  > nil                Lfd > nil
    Li  > cons(Int, Li)      Lfd > cons(A, Lfd)
    Int > nat                A   > anyfd
    Int > neg

Knowing that $[\![nat]\!] \subseteq [\![anyfd]\!]$ we can apply the intersection operation to obtain a
grammar defining $[\![Li]\!] \cap [\![Lfd]\!]$:

    Li ∩ Lfd > nil
    Li ∩ Lfd > cons(Int ∩ A, Li ∩ Lfd)
    Int ∩ A  > nat
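The intersection in this example can be reproduced by a short Python sketch of a product construction over grammar variables. Everything about the encoding is an assumption of the sketch: a grammar is a dict from variable names to rule lists, a base symbol is a plain string, a function rule is a tuple `(f, X1, ..., Xn)`, and `SUBSET` is a hand-written stand-in for the table of relations between base sets. The symbol $\top$ and constants belonging to base sets are not handled, and nullable variables are not removed from the result.

```python
# Pairs (b1, b2) with [[b1]] a subset of [[b2]]; all other distinct
# base symbols are treated as disjoint (assumption of this sketch).
SUBSET = {('nat', 'anyfd')}

def meet(b1, b2):
    """Base symbol denoting the intersection of two base sets, or None if empty."""
    if b1 == b2 or (b1, b2) in SUBSET:
        return b1
    if (b2, b1) in SUBSET:
        return b2
    return None

def intersect(g1, x, g2, y):
    """Grammar whose variable (x, y) defines the intersection of x in g1 and y in g2."""
    result, todo = {}, [(x, y)]
    while todo:
        pair = todo.pop()
        if pair in result:
            continue
        rules = []
        for r1 in g1[pair[0]]:
            for r2 in g2[pair[1]]:
                if isinstance(r1, str) and isinstance(r2, str):
                    b = meet(r1, r2)          # two base symbols
                    if b is not None:
                        rules.append(b)
                elif (isinstance(r1, tuple) and isinstance(r2, tuple)
                      and r1[0] == r2[0] and len(r1) == len(r2)):
                    args = tuple(zip(r1[1:], r2[1:]))   # pairwise product variables
                    rules.append((r1[0], *args))
                    todo.extend(args)
        result[pair] = rules
    return result

# The two grammars of Example 2.14.
G_LI = {'Li': [('nil',), ('cons', 'Int', 'Li')], 'Int': ['nat', 'neg']}
G_LFD = {'Lfd': [('nil',), ('cons', 'A', 'Lfd')], 'A': ['anyfd']}

G_BOTH = intersect(G_LI, 'Li', G_LFD, 'Lfd')
```

The pair `('Li', 'Lfd')` plays the role of the variable written Li ∩ Lfd above, and `('Int', 'A')` that of Int ∩ A; the rule for neg disappears because neg and anyfd are disjoint in the table.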

The treatment of constraints by the formalism of extended term grammars is
rather rough. It stems from a small number of fixed base sets of constrained terms.
They are subject to the rather restrictive Requirement 2.13, which is necessary to
simplify operations on grammars. In our former work (Drabent & Pietrzak, 1998)
we discussed a richer system of regular sets of constrained terms. It can be seen
as also allowing base sets of the form $cl(\{c \,[]\, x\})$, where the set of ground terms
satisfying constraint $c$ is regular. This results in substantially more complicated
algorithms for grammar operations. According to our experience the simple type
system presented in this paper seems sufficient.



2.3 Constraint Logic Programming 

We consider CLP programs executed with the Prolog selection rule (LD-resolution)
and using syntactic unification in the resolution steps. In CLP with syntactic
unification, function symbols occurring outside of constraints are treated as
constructors. So, for instance in CLP over integers, the goal $p(4)$ fails with the program
$\{p(2{+}2) \leftarrow\}$ (but the goal $p(X{+}Y)$ succeeds). Terms $4$ and $2{+}2$ are treated as not
unifiable despite having the same numerical value. Also, a constraint may distinguish
such terms. For example in many constraints of CHIP, an argument may
be a natural number (or a "domain variable") but not an arithmetical expression.
Resolution based on syntactic unification is used in many CLP implementations,
for instance in CHIP and in SICStus (SICS, 1998).



We are interested in calls and successes of program predicates in computations 
of the program. Both calls and successes are constrained atoms. A precise defini- 
tion is given below taking a natural generalization of LD-derivation as a model of 
computation. 

An LD-derivation is a sequence $G_0, C_1, \theta_1, G_1, \ldots$ of goals, input clauses and
mgu's (similarly to (Lloyd, 1987)). A goal is of the form $c \,[]\, A_1, \ldots, A_n$, where $c$ is a
constraint and $A_1, \ldots, A_n$ are atomic formulae (including atomic constraints). For
a goal $G_{i-1} = c \,[]\, A_1, \ldots, A_n$, where $A_1$ is not a constraint, and a clause $C_i = H \leftarrow
B_1, \ldots, B_m$, the next goal in the derivation is $G_i = (c \,[]\, B_1, \ldots, B_m, A_2, \ldots, A_n)\theta_i$
provided that $\theta_i$ is an mgu of $A_1$ and $H$, $c\theta_i$ is satisfiable and $G_{i-1}$ and $C_i$ do not
have common variables. If $A_1$ is a constraint then $G_i = c, A_1 \,[]\, A_2, \ldots, A_n$ ($\theta_i = \epsilon$
and $C_i$ is empty) provided that $c, A_1$ is satisfiable.

For a goal $G_{i-1}$ as above we say that $c \,[]\, A_1$ is a call (of the derivation). The
call succeeds in the first goal of the form $G_k = c' \,[]\, (A_2, \ldots, A_n)\rho$ (where $k \geq i$,
$\rho = \theta_i \cdots \theta_k$) of the derivation. The success corresponding (in the derivation) to the
call above is $c' \,[]\, A_1\rho$. For example, $X \in \{1,2,3,4\} \,[]\, p(X, Y)$ and $X \in \{1,2,4\} \,[]\, p(X, 7)$
is a possible pair of a call and a success for $p$ defined by $p(X, 7) \leftarrow X \neq 3$.

Notice that in this terminology constraints succeed immediately. If $A$ is a constraint
then the success of call $c \,[]\, A$ is $c, A \,[]\, A$, provided $c, A$ is satisfiable. So we do
not treat constraints as delayed; we abstract from internal actions of the constraint
solver.

The call-success semantics of a program $P$, for a set of initial goals $\mathcal{G}$, is a pair
$CS(P, \mathcal{G}) = (C, S)$ of sets of constrained atoms: the set of calls and the set of
successes that occur in the LD-derivations starting from goals in $\mathcal{G}$. We assume
without loss of generality that the initial goals are atomic.

So the call-success semantics describes precisely the calls and the successes in
the considered class of computations of a given program. The question is whether
this set includes "wrong" elements, unexpected by the user. To require a precise
description of user expectations is usually not realistic. On the other hand, it may
not be difficult to provide an approximate description $Spec = (C', S')$ where $C'$ and
$S'$ are sets of constrained atoms such that every expected call is in $C'$ and every
expected success is in $S'$.

Definition 2.15

A program $P$ with the set of initial goals $\mathcal{G}$ is partially correct w.r.t. $Spec = (C', S')$
iff $C \subseteq C'$ and $S \subseteq S'$, where $(C, S) = CS(P, \mathcal{G})$ is the call-success semantics of $P$
and $\mathcal{G}$.

$P$ is partially correct w.r.t. $Spec = (C', S')$ iff $P$ with $C'$ as the set of initial goals
is partially correct w.r.t. $Spec$.

We will usually omit the word "partially".

To avoid substantial technical difficulties, we will consider only specifications that
are closed under instantiation. This means that whenever set $C'$ (or $S'$) contains a
constrained atom $c \,[]\, A$ then it contains all its instances.

In Section ^ we introduce parametric specifications, discuss a more precise se- 
mantics and generalize accordingly the notion of program correctness. 

Our discussion of CLP semantics has been carried out under the assumption that
the constraint solver is complete. Thus it is able to recognize all unsatisfiable
constraints. However, actual solvers are usually incomplete. As a result, goals with
unsatisfiable constraints may appear in derivations. But the set of solutions rep- 
resented by all answers of an incomplete solver is the same as the set of solutions 
represented by all answers of a complete solver. Thus, if our type checking technique 
indicates (possibility of) the existence of a wrong answer, beyond those character- 
ized by a specification, then this answer will also be obtained with an incomplete 
solver. Thus the assumption on completeness of the solver is only a technicality 
needed for formal development of the method, which is also applicable in the case 
of incomplete solvers. 

A specification describes calls and successes of all the predicates of a program,
including the constraint predicates. As the semantics of constraints is fixed for a
given programming language, their specification is fixed too. In our system it is
kept in a system library and is not intended to be modified by the user. (The
same happens for other built-in predicates of the language.) This fixed part of the
specification may not permit some constrained atoms as procedure calls; such calls
are not allowed in the language and result in run-time errors.*

Example 2.16

To illustrate the treatment of constraint predicates by specifications, assume that
a CLP(FD) language has a constraint $\in$, which describes membership in a finite
domain. Assume that invoking $\in(X, S)$ with $S$ not being a list of natural numbers
is an error. This should be reflected by the specifications of all programs using $\in$.
In any such specification $Spec = (Pre, Post)$, a call of the form $c \,[]\, {\in}(X, S)$ is in
$Pre$ iff $S$ is such a list. If such a call succeeds, $X$ must be a finite domain variable
or a natural number. We may thus require that $c \,[]\, {\in}(X, S)$ is in $Post$ iff $S$ is a list
of natural numbers and $c \,[]\, X$ is in $[\![anyfd]\!]$.

The following definition provides a condition assuring that a specification cor- 
rectly approximates successes of constraint predicates. 

Definition 2.17

We say that a specification $(Pre, Post)$ respects constraints if $c, A \,[]\, A \in Post$ whenever
$c \,[]\, A \in Pre$ and $c, A$ is satisfiable (for any constraint $c$ and atomic constraint
$A$). This is equivalent to

$$\{\, c, A \,[]\, A \mid c, A \text{ is satisfiable} \,\} \cap Pre \subseteq Post$$

as $Pre$ is closed under instantiation.



3 Partial correctness of programs 

In this section we present a verification condition for partial correctness of CLP 
programs. Then we express it by means of set constraints and show how to perform 
correctness checking and how to compute a specification approximating the call- 
success semantics of a program. 



3.1 Verification condition 



A sufficient condition for such correctness of logic programs was given in (Drabent
& Maluszynski, 1988). For specifications which are closed under substitution the
condition is simpler (Bossi & Cocco, 1989), (Apt, 1997). Generalizing the latter for
constraint logic programs we obtain:

Proposition 3.1

Let $P$ be a CLP program, $\mathcal{G}$ a set of initial goals and $Spec = (Pre, Post)$ be
a specification respecting constraints and such that $Pre$, $Post$ are closed under
instantiation.

A sufficient condition for $P$ with $\mathcal{G}$ being correct w.r.t. $Spec$ is:



* An exact description of the set of allowed calls of constraints is sometimes impossible in our
framework, as the set may not be instance-closed. For example, many constraints of CHIP have
to be called with certain arguments being variables.

1. For each clause $H \leftarrow B_1, \ldots, B_n$ of $P$, $j = 0, \ldots, n$, any substitution $\theta$ and
any constraint $c$:

    if $c \,[]\, H\theta \in Pre$, $c \,[]\, B_1\theta \in Post$, $\ldots$, $c \,[]\, B_j\theta \in Post$
    then $c \,[]\, B_{j+1}\theta \in Pre$ for $j < n$,
         $c \,[]\, H\theta \in Post$ for $j = n$.

2. $\mathcal{G} \subseteq Pre$
Proof 

Follows from more general Theorem 5^ applied to a specification set {{Pre, Pre H 
Post)}. □ 

For simplicity we consider here only atomic initial goals. Generalization for
non-atomic ones is not difficult. For instance one may replace a goal $c \,[]\, A$ by goal $p$
and an additional clause $p \leftarrow c, A$ in the program, where $p$ is a new predicate
symbol. Alternatively, one can provide a condition for goals similar to that for
clauses (Drabent & Maluszynski, 1988), (Apt, 1997).



Notice that the constraints in the clause are treated in the same way as other 
atomic formulae. As constraint predicates are not defined by program clauses, the 
requirement that the specification respects constraints is needed in the proposition. 

The part of the specification concerning constraint predicates is fixed for a given 
CLP language. As already mentioned, in our system it is kept in a system library. 
It is the responsibility of the librarian to assure that the library specification re- 
spects constraints. This property depends on the constraint domain in question, 
and therefore no universal tool can be provided. The number of constraint predicates
in any CLP language is finite, and so is the library specification, which has to be
proved to respect constraints only once.



We want to represent Proposition 3.1 as a system of set constraints. Each implication
for a clause $C = H \leftarrow B_1, \ldots, B_n$ from condition 1 of the proposition can now
be expressed by a system $F_j(C) = F_{j,1}(C) \cup F_{j,2}(C)$ of constraints, where $F_{j,1}(C)$
consists of

$$X > H^{-X}(Call) \cap \bigcap_{i=1}^{j} B_i^{-X}(Success) \qquad (1)$$

for each variable $X$ occurring in the program clause and $F_{j,2}(C)$ contains one constraint

$$\begin{array}{ll} Call > B_{j+1} & \text{if } j < n \\ Success > H & \text{if } j = n \end{array} \qquad (2)$$

(The program variables occurring in the clause become variables of set constraints.
As explained in Section 2.2.2, the predicate symbols are treated as function symbols.)
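The shape of the systems $F_j(C)$ can be made explicit by generating them mechanically from a clause. The Python sketch below only produces the constraints as strings; atoms are passed in as text, `^-X` is a plain-text rendering of the generalized projection onto $X$, and `&` stands for set intersection. The function name and these notational choices are assumptions of the sketch.

```python
def verification_constraints(head, body, variables):
    """Return the list [F_0(C), ..., F_n(C)] for the clause head <- body.

    Each F_j(C) is a list of constraint strings: one constraint of
    form (1) per clause variable, followed by the single constraint
    of form (2).
    """
    n = len(body)
    systems = []
    for j in range(n + 1):
        system = []
        for x in variables:
            # Projection of the head on Call, then of B_1..B_j on Success.
            parts = [f"{head}^-{x}(Call)"]
            parts += [f"{b}^-{x}(Success)" for b in body[:j]]
            system.append(f"{x} > " + " & ".join(parts))
        # Constraint (2): the next body atom is a call, or the head succeeds.
        if j < n:
            system.append(f"Call > {body[j]}")
        else:
            system.append(f"Success > {head}")
        systems.append(system)
    return systems

# The recursive clause of the append program.
systems = verification_constraints(
    "app([A|X],Y,[A|Z])", ["app(X,Y,Z)"], ["A", "X", "Y", "Z"])
```

For a clause with $n$ body atoms the function returns $n + 1$ systems, one per value of $j$; for the clause above these are the system for $j = 0$ (ending in a `Call` constraint) and the system for $j = 1$ (ending in a `Success` constraint).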

This constraint system has the following property. 
Lemma 3.2

Let $C = H \leftarrow B_1, \ldots, B_n$ be a clause and $Spec = (Pre, Post)$ a specification. If
constraint set $F_j(C)$ has a model assigning to $Call$ the set $Pre$ and to $Success$ the
set $Post$ then the implication of Proposition 3.1 holds, for any $\theta$ and $c$.

Proof

Assume that $I$ is such a model. From (1) it follows that $c \,[]\, X\theta \in I(X)$ for each $c$, $\theta$
satisfying the premise of the implication and for each variable $X$ in the clause. Now
from (2) it follows that $c \,[]\, B_{j+1}\theta \in I(B_{j+1}) \subseteq Pre$, respectively $c \,[]\, H\theta \in I(H) \subseteq
Post$ when $j = n$. □

Set constraints $F_j(C)$ express a sufficient condition for program correctness. If
a specification is given, to check the correctness it suffices to check whether the
specification extends to a model of $F_j(C)$ (for all $C \in P$ and $j$). In the sequel we
show how to do this effectively for the case when $Pre$ and $Post$ are defined by
discriminative extended term grammars.

If a specification is not given, Lemma 3.2 tells us that the program is correct with
respect to the specification obtained from any model of $F_j(C)$ (for all $C$ and $j$). An
algorithm for constructing a discriminative term grammar describing a model of
the constraints could thus be seen as a type inference algorithm for this program.



3.2 Correctness checking 

In this section we present an algorithm for checking program correctness. We will
consider specifications given by means of extended term grammars. Such a grammar
$G$ has distinguished variables $Call$, $Success$ and the specification is $Spec =
([\![Call]\!]_G, [\![Success]\!]_G)$ (so $Pre = [\![Call]\!]_G$, $Post = [\![Success]\!]_G$). We require that
the variables of $G$ are distinct from those occurring in the program. We also require
that $Spec$ respects constraints. So such a grammar can be seen as consisting of two
parts: a fixed part describing the constraints and built-in predicates, and a part
provided by the user.

Example 3.3

The specification of constraint predicate $\in$ from Example 2.16 can be given by the
following grammar rules.

    Call  > ∈(Any, Nlist)        Success > ∈(Anyfd, Nlist)
    Nlist > []                   Anyfd   > anyfd
    Nlist > cons(Nat, Nlist)     Nat     > nat

Consider an atom $B = {\in}(X, [I, J])$. Applying the generalized projection operation
one can compute that $B^{-X}([\![Success]\!]) = [\![anyfd]\!]$ and $B^{-I}([\![Success]\!]) = [\![nat]\!]$.

Notice that within the formalism of extended term grammars we cannot provide a
more precise specification. For instance we cannot express the fact that if $c \,[]\, {\in}(t_1, t_2)$
is a success then $c$ constrains the value of $t_1$ to the numbers that occur in the list
$t_2$ (formally: any ground element of $cl(\{c \,[]\, t_1\})$ is a member of $t_2$).

Our algorithm employs the inclusion check, intersection and generalized projection
operations for extended term grammars. As already mentioned, they are
rather natural generalizations of the operations for term grammars described in
Section 2.1.1. The details can be found in Section 4.4, describing operations for
parametric extended term grammars.
The algorithm resembles a single iteration of the iterative algorithm of (Gallagher
& de Waal, 1994) for approximating logic program semantics, in its version with
"magic transformation". However it works on extended term grammars. We provide
its detailed description combined with a proof of its correctness, in order to facilitate
a further generalization to the parametric case.

As explained in the previous section, a sufficient condition for a program $P$ to be
correct w.r.t. $Spec$ is that for each $n$-ary clause $C$ of $P$ and for each $j = 0, \ldots, n$,
constraints $F_j(C)$ have a model that coincides on $Call$ and $Success$ with the least
model of $G$.

To find such a model we construct (a grammar describing) the least model of
$F_{j,1}(C) \cup G$. Then we check if it is a model of $F_{j,2}(C)$. If yes then it is the required
model of $F_j(C)$. Otherwise we show that the required model does not exist.

The first step is to compute the projections and intersections of (1). To each
expression of the form $A^{-X}(Y)$ occurring in (1) we apply the generalized projection
operation to construct a grammar $G_A$ defining $A^{-X}([\![Y]\!]_G)$. Then we apply the
intersection algorithm to grammars $G_H, G_{B_1}, \ldots, G_{B_j}$. As a result (after appropriate
renaming of the variables of the resulting grammar) we obtain a grammar $G_X$ such
that

$$[\![X]\!]_{G_X} = H^{-X}([\![Call]\!]_G) \cap \bigcap_{i=1}^{j} B_i^{-X}([\![Success]\!]_G)$$

and all the variables of $G_X$, except for $X$, are distinct from those of $F_j(C) \cup G$.
Obviously, $[\![X]\!]_{G_X}$ is the same as $[\![X]\!]$ in the least model of $\{(1)\} \cup G$.

The first step is to be applied to each constraint (1) of $F_j(C)$ (with a requirement
that the variables of the constructed grammars $G_X$ are distinct). Let $G' = \bigcup_X G_X$
be the union of the grammars constructed in the first step. We combine $G'$ and $G$,
where the roles of $G'$, $G$ are to define values for, respectively, the variables of $C$ and
variables $Call$, $Success$. The least model of $G \cup G'$ is a model of $F_{j,1}(C) \cup G$ (and it
coincides with the least model of $F_{j,1}(C) \cup G$ on $Vars(C) \cup \{Call, Success\}$, where
$Vars(C)$ is the set of the variables occurring in $C$).

The second step is transforming (2) to a discriminative grammar $G''$, by applying
repetitively the construction operation. Let us represent constraint (2) as $Y > A$
(so $Y$ is $Call$ or $Success$ and $A$ is $B_{j+1}$ or $H$). For each subterm $s$ of $A$, $G''$ employs
a variable $X_s$. $X_A$ is $Y$ and if the given subterm $s$ is a variable $V$ then $X_V$ is $V$.
Otherwise $X_s$ is a new variable, not occurring in $C$, $G$, $G'$. Grammar $G''$ contains
the rule $X_s > f(X_{s_1}, \ldots, X_{s_n})$ for each non-variable subterm $s = f(s_1, \ldots, s_n)$
of $A$. We have $[\![X_s]\!]_{G' \cup G''} = [\![s]\!]_{G'}$, for each subterm $s$. In particular $[\![Y]\!]_{G' \cup G''} =
[\![A]\!]_{G'} = [\![A]\!]_{G \cup G'}$.

This completes the construction. We may say that $F_j(C)$ was transformed into
a discriminative grammar $F_{Cj} = G' \cup G''$.

It remains to check whether $[\![Y]\!]_{G' \cup G''} \subseteq [\![Y]\!]_G$. If yes then $[\![A]\!]_{G \cup G'} \subseteq [\![Y]\!]_{G \cup G'}$,
i.e. the least model of $G \cup G'$ is a model of $Y > A$. Thus it is the model of $F_j(C) \cup G$
required in Lemma 3.4.

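Otherwise, notice first that if $F_1 \subseteq F_2$ then $[\![X]\!]_{F_1} \subseteq [\![X]\!]_{F_2}$, for constraint
sets $F_1, F_2$. So we have $[\![Y]\!]_{G' \cup G''} = [\![A]\!]_{G \cup G'} = [\![A]\!]_{F_{j,1}(C) \cup G} \subseteq [\![A]\!]_{F_j(C) \cup G} \subseteq
[\![Y]\!]_{F_j(C) \cup G}$. Thus $[\![Y]\!]_{G' \cup G''} \not\subseteq [\![Y]\!]_G$ implies $[\![Y]\!]_{F_j(C) \cup G} \not\subseteq [\![Y]\!]_G$. Hence $I(Y) \not\subseteq
[\![Y]\!]_G$ for any model $I$ of $F_j(C) \cup G$ and the required model of $F_j(C) \cup G$ does not
exist.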

Thus we obtained: 

Lemma 3.4

The implication from Proposition 3.1 holds for a clause $C$ and a number $j$ if
$[\![Y]\!]_{G' \cup G''} \subseteq [\![Y]\!]_G$ for grammars $G'$, $G''$ constructed as above.

The inclusion can be checked by applying the inclusion algorithm (preceded by
removing nullable symbols).

We now estimate the complexity of the algorithm. The cost of the intersection
operation applied to two grammars with respectively $v_1, v_2$ variables is $O(v_1 v_2)$.
The cost of removing nullable symbols is linear (Hopcroft et al., 2001).

Let us now consider the inclusion check. We may assume that grammars are
stored so that the productions for each variable are kept together and ordered.
Let $v_1, v_2$ be the numbers of variables in the grammars. For each encountered pair
$X, Y$ of variables, it has to be checked whether the pair has not occurred previously
($O(\log(v_1 v_2))$) and the productions for $X$ and for $Y$ are to be found ($O(\log(v_1) +
\log(v_2))$). The pairs of productions with the same function symbol can be found in
time proportional to the number of function symbols occurring in the productions
found. For each pair of productions $X > f(\ldots)$, $Y > f(\ldots)$ new variable pairs
are generated; their number is the arity of $f$. Taking as constants the maximal
arity and the maximal number of function symbols in the productions for a given
variable, we obtain $O(\log(v_1 v_2))$ per pair. So the total cost of the inclusion check is
$O(v_1 v_2 \log(v_1 v_2))$. This cost is not changed when the costs of initial sorting of the
grammars are taken into account.
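The pairwise traversal described above can be sketched in Python for plain discriminative term grammars. The sketch assumes that the first grammar has no nullable variables, that there are no base symbols, and that a grammar is a dict mapping each variable to a list of rules `(f, X1, ..., Xn)`; the name `included` is hypothetical. It records visited pairs in a hash set rather than in the ordered storage assumed in the cost estimate, so its bookkeeping cost per pair differs from the $O(\log(v_1 v_2))$ figure.

```python
def included(g1, x, g2, y):
    """Check whether the set defined by x in g1 is a subset of that of y in g2.

    Assumes both grammars are discriminative and g1 has no nullable
    (empty) variables; base symbols are not supported in this sketch.
    """
    seen, todo = set(), [(x, y)]
    while todo:
        pair = todo.pop()
        if pair in seen:
            continue                      # pair already examined
        seen.add(pair)
        # In a discriminative grammar a variable has at most one rule
        # per function symbol and arity, so this index is well defined.
        rules2 = {(r[0], len(r)): r for r in g2[pair[1]]}
        for r1 in g1[pair[0]]:
            r2 = rules2.get((r1[0], len(r1)))
            if r2 is None:
                return False              # no matching production on the right
            todo.extend(zip(r1[1:], r2[1:]))   # compare arguments pairwise
    return True

# Lists of zeros versus lists of unary natural numbers (illustrative grammars).
ZEROS = {'L1': [('nil',), ('cons', 'Z', 'L1')], 'Z': [('0',)]}
NATS = {'L': [('nil',), ('cons', 'N', 'L')], 'N': [('0',), ('s', 'N')]}
```

With these grammars, every list of zeros is a list of naturals but not conversely, so `included(ZEROS, 'L1', NATS, 'L')` holds while the check in the other direction fails on the production `N > s(N)`. At most $v_1 v_2$ pairs are ever placed in `seen`, which matches the number of pairs counted in the estimate above.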

Notice that in our algorithm the results of all the generalized projections and
intersections computed in the step for $j$ can be reused in the next steps. Taking
into account the intersections needed to compute the projections, there are $k - 1$
intersections to be computed for each variable occurring $k$ times in the clause $C$.
The cost of computing such a $k$-fold intersection and the size of the resulting grammar
is $O(v^{k-1})$, where $v$ is the number of variables in the specification grammar $G$.

Computing the mappings used in the projections and constructing all the $G''$ is linear in
the size of the clause. Inclusion checking for a pair of grammars with respectively
$O(v^{k-1})$ and $v$ variables can be done in time $O(v^k \log(v^k)) = O(c^k)$, where constant
$c$ depends on the number of variables in the grammar.

Thus the correctness checking algorithm described in this section works in time
$O(c^k)$, where $k$ is the maximal number of occurrences of a variable in a clause.

Example 3.5
Consider the program

    app([],V,V).
    app([A|X],Y,[A|Z]) :- app(X,Y,Z).

The verification conditions can be expressed as three constraint systems (we abbreviate $H = app([A|X],Y,[A|Z])$, $B = app(X,Y,Z)$):

    $V > app([\,],V,V)^{-V}(Call)$
    $Success > app([\,],V,V)$

    $A > H^{-A}(Call)$
    $X > H^{-X}(Call)$
    $Y > H^{-Y}(Call)$
    $Z > H^{-Z}(Call)$
    $Call > app(X,Y,Z)$

    $A > H^{-A}(Call) \cap B^{-A}(Success)$
    $X > H^{-X}(Call) \cap B^{-X}(Success)$
    $Y > H^{-Y}(Call) \cap B^{-Y}(Success)$          (3)
    $Z > H^{-Z}(Call) \cap B^{-Z}(Success)$
    $Success > H$
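
The shape of these three systems can be reproduced mechanically. The sketch below only prints the constraints as strings; it is an illustration read off from this example, and the function name `constraint_systems`, the textual notation `t^-X(e)` for a generalized projection and `&` for intersection are assumptions of the sketch, not notation of the paper.

```python
def constraint_systems(head, body, variables):
    """Return one list of constraints (as strings) per body atom, plus one for the head.

    head, body atoms and variables are plain strings; '&' stands for set
    intersection and 't^-X(e)' for the generalized projection of e along t and X.
    """
    systems = []
    for j in range(len(body) + 1):
        done = body[:j]  # body atoms that have already succeeded at this point
        system = []
        for v in variables:
            parts = [f"{head}^-{v}(Call)"] + [f"{b}^-{v}(Success)" for b in done]
            system.append(f"{v} > " + " & ".join(parts))
        if j < len(body):
            system.append(f"Call > {body[j]}")   # the call of the j-th body atom
        else:
            system.append(f"Success > {head}")   # the success of the clause head
        systems.append(system)
    return systems
```

Calling `constraint_systems("H", ["B"], ["A", "X", "Y", "Z"])` yields the second and the third system above (with `B` standing for `app(X,Y,Z)`), and `constraint_systems("app([],V,V)", [], ["V"])` yields the first one.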

Let the following extended term grammar G provide a specification.

    $Call > app(L, L, Any)$
    $Success > app(L, L, L)$
    $L > [\,]$
    $L > [M|L]$
    $Any > \top$

where M is further specified by grammar rules not presented here. We assume that M is not nullable in G.

Using the described techniques one can check that the specification defines a model for all the set constraint systems stated above. For example, we check the constraints (3). To compute the projections related to atom $H = app([A|X],Y,[A|Z])$ and Call we first obtain the following mapping between the subterm occurrences in H and the variables of G.

    $V_{[A|X]} = L$        $V_{A_1} = M$
    $V_{[A|Z]} = Any$      $V_{A_2} = Any$
    $V_X = V_Y = L$
    $V_Z = Any$

Similarly, for the projections related to atom $B = app(X,Y,Z)$ and Success, we have

    $V_X = V_Y = V_Z = L$

The grammar describing $H^{-A}(Call)$ is $G \mathbin{\dot\cap} G$ with a distinguished variable $M \mathbin{\dot\cap} Any$. The rules of $G \mathbin{\dot\cap} G$ for $M \mathbin{\dot\cap} Any$ are $\{ M \mathbin{\dot\cap} Any > t \mid M > t \in G \}$. (Also $G \subseteq G \mathbin{\dot\cap} G$.) $M \mathbin{\dot\cap} Any$ is not nullable in $G \mathbin{\dot\cap} G$, as M is not nullable in G.

Notice that $B^{-A}(Call) = [\![\top]\!]$ (as A does not occur in B). All the other projections from (3) are given by variable L or Any and grammar G.

Now we construct grammar $G'$ for which

    $[\![A]\!]_{G'} = [\![M \mathbin{\dot\cap} Any]\!]_{G \dot\cap G} \cap [\![\top]\!]$
    $[\![X]\!]_{G'} = [\![L]\!]_{G} \cap [\![L]\!]_{G}$
    $[\![Y]\!]_{G'} = [\![L]\!]_{G} \cap [\![L]\!]_{G}$
    $[\![Z]\!]_{G'} = [\![Any]\!]_{G} \cap [\![L]\!]_{G}$

Computing intersections (and renaming variables where necessary) results in a grammar $G'$ consisting of the rules

    $\{ A > t \mid M > t \in G \} \cup \{ X > [\,],\ Y > [\,],\ Z > [\,],\ X > [M|X],\ Y > [M|Y],\ Z > [M|L] \}$

and the rules of G except for those for Call, Success. (Before constructing the grammar we simplified $[\![M \mathbin{\dot\cap} Any]\!]_{G \dot\cap G} \cap [\![\top]\!]$ to $[\![M \mathbin{\dot\cap} Any]\!]_{G \dot\cap G}$ and $[\![L]\!]_{G} \cap [\![L]\!]_{G}$ to $[\![L]\!]_{G}$. Formally, $G'$ has variables distinct from those of G.) Variables A, X, Y, Z are not nullable in $G'$.

The least model of $G'$ provides a valuation for variables A, X, Y, Z. It remains to check that for this valuation, together with the valuation for Success given by the specification G, the constraint $Success > app([A|X],Y,[A|Z])$ holds. To do this we transform this constraint into a discriminative grammar $G''$:

    $Success > app(X_1, Y, X_2)$
    $X_1 > [A|X]$
    $X_2 > [A|Z]$

and apply the set inclusion algorithm to check whether the set defined by Success in the specification grammar G is a superset of that defined by Success in the obtained grammar $G' \cup G''$. The check succeeds. Hence there exists a model for the considered five constraints which agrees on variables Call and Success with the model given by the specification. Notice that this holds independently of the missing fragment of G defining M.

The same procedure can be performed for all the constraint systems generated for the given program, hence confirming that the program is correct w.r.t. the parametric specification. Also in these cases the correctness check is independent of $\langle M \rangle_G$ (the part of G defining M).

In our example the correctness check was independent of a subset $\langle M \rangle_G$ of the specification grammar G. This is not uncommon: for some programs and specification grammars a correctness check refers only to some rules of the grammar. Thus a single check is valid for a whole family of grammars. This phenomenon will be exploited in our approach to parametric specifications.



3.3 Approximating program semantics 

In this work we are mainly interested in checking program correctness. However, the representation of the verification condition (Proposition 3.1) as constraints (Lemma 3.2) can be used to obtain an approximation of the semantics of a given program P. In the previous section we showed how a single implication from Proposition 3.1 can be expressed by a constraint system $F_j(C)$. We begin with constructing a constraint system representing all the implications from the proposition.

Let us consider the constraints $F_j(C)$ ($j = 1, \ldots, n_C$) for each clause C of P with $n_C$ body atoms. Let $F'_j(C)$ be $F_j(C)$ with the variables renamed in such a way that the only common variables of (distinct) $F'_{j_1}(C_1)$, $F'_{j_2}(C_2)$ are Call and Success. Let grammar $G_0$ specify the initial goals and the constraint predicates. So $[\![Call]\!]_{G_0}$ is the set of initial goals and of the allowed calls of constraints. $[\![Success]\!]_{G_0}$ is (a superset of) the set of possible successes of constraint predicates.* Thus $([\![Call]\!]_{G_0}, [\![Success]\!]_{G_0})$ respects constraints.

Now any model I of the constraint system

    $\mathcal{C}(P) = \bigcup_{C \in P} \bigcup_{j} F'_j(C) \cup G_0$

gives a specification $Spec = (I(Call), I(Success))$ with respect to which P is correct, provided that Spec respects constraints. This follows immediately from Lemma 3.2.



In the special case of logic programs a model of $\mathcal{C}(P)$ can be found by using the techniques for set constraint solving. For example, the technique of Heintze and Jaffar (1990a; 1991) produces a (non-discriminative) term grammar specifying the least model of set constraints. This technique has been used for generating approximations of logic program semantics (Heintze & Jaffar, 1990b; Heintze, 1992; Heintze & Jaffar, 1994; Charatonik & Podelski, 1998). Another constraint solving approach, which uses tree automata techniques, has been presented in (Devienne et al., 1997a; Talbot et al., 2000). We expect that these techniques can be generalized to the case of CLP programs, but we have not investigated this issue yet.

Yet another approach to finding a model of the constraint system $\mathcal{C}(P)$ stems from abstract interpretation techniques (among others (Janssens & Bruynooghe, 1992; Van Hentenryck et al., 1995), (Gallagher & de Waal, 1994); we generalize the latter work in (Drabent & Pietrzak, 199£; Drabent et al., 2000b; Drabent et al., 2000a) and here). $\mathcal{C}(P)$ is seen as a valuation transformer; its fixed points are models of $\mathcal{C}(P)$. Valuations are represented as discriminative grammars. A fixed point is computed iteratively.

To augment our system with a tool for computing approximations of program semantics, we provide a solution based on the latter idea. This choice was guided mainly by the possibility of reusing our correctness checking algorithm and the implementation of (Gallagher & de Waal, 1994).

The correctness checking algorithm of the previous section can be easily modified to compute the valuation transformer related to $\mathcal{C}(P)$. This gives an implementation of a single step of the iteration. It remains to combine it with some technique of assuring termination.

Iteration step. Take $G_i$ (initially $G_0$). To each $F'_j(C) \cup G_i$ apply the construction as in the correctness checking, obtaining a discriminative grammar $F_{C,j}$. (It is required that all the obtained grammars have distinct variables, except Call and Success.) For each $F_{C,j}$, the variables occurring in $F_{C,j}$ are distinct from those in $G_i$ except for Call or Success.

* This approach can also be used when P is a fragment of a program, i.e. the clauses defining some predicates are missing in P. Then the semantics of such predicates has to be specified by $G_0$. The algorithm treats them as the constraint predicates. Examples of such program fragments are programs using built-in predicates, unfinished programs or modules of some bigger programs.

The constraints of $F_j(C)$ are satisfied if the occurrences of Call, Success in the right hand side of each constraint of the form (1) (Section 3.1) are valuated as in the least model of $G_i$, and the remaining variable occurrences as in the least model of $F_{C,j}$. This follows from the discussion in the previous section.

The obtained grammar $G'_i = G_i \cup \bigcup_{C \in P} \bigcup_{j} F_{C,j}$ is not discriminative, due to the rules for Call and for Success. Construct a discriminative approximation of $G'_i$, more precisely a discriminative grammar $G_{i+1}$ such that $[\![Call]\!]_{G'_i} \subseteq [\![Call]\!]_{G_{i+1}}$ and the same for Success. This is done by applying the union operation of Section 2.1.1 to $G_i$ and all grammars $F_{C,j}$. (So $G_{i+1}$ is $G_i \mathbin{\dot\cup} \dot{\bigcup}_{C \in P} \dot{\bigcup}_{j} F_{C,j}$ with the variable $Call \mathbin{\dot\cup} \ldots \mathbin{\dot\cup} Call$ renamed into Call and $Success \mathbin{\dot\cup} \ldots \mathbin{\dot\cup} Success$ renamed into Success.)

The obtained grammar $G_{i+1}$ has the following property. $\mathcal{C}(P) - G_0$ is true when Call and Success in all the constraints of the form (1) (Section 3.1) are valuated as in the least model of $G_i$, Call and Success in the constraints of the form (2) (Section 3.1) as in the least model of $G_{i+1}$, and the (renamed) variables of P as in the least model of $G'_i$.

It remains to check whether the specification given by $G_{i+1}$ does not contain incorrect calls of constraint predicates. This boils down to checking whether all the calls of constraint predicates from the set $[\![Call]\!]_{G_{i+1}}$ are also members of $[\![Call]\!]_{G_0}$. The latter is equivalent to $[\![Call]\!]_{F} \subseteq [\![Call]\!]_{G_0}$, where $F = G_{i+1} - \{\, Call > A \mid A \text{ is not a constraint} \,\}$. Failure of the check means that we are unable to construct a specification which respects constraints. This suggests a program error and an appropriate warning is issued.

This completes an iteration step. Notice that the calls and successes of constraint predicates specified by $G_{i+1}$ are the same as those specified by $G_i$ and thus by $G_0$ (induction on i). For calls it follows from succeeding of the checks above. For successes we have that any clause $Success > p(X)$ from $G_{i+1}$, where p is a constraint predicate, occurs also in $G_i$.

The iteration is terminated if a fixpoint is reached, that is, when $[\![Call]\!]_{G_{i+1}} \subseteq [\![Call]\!]_{G_i}$ and $[\![Success]\!]_{G_{i+1}} \subseteq [\![Success]\!]_{G_i}$. (The inclusion in the other direction holds for each i.) The required model of $\mathcal{C}(P)$ is a valuation in which the values of the variables from $G_0$, except for Call and Success, are as in the least model of $G_0$, the values of Call, Success are as in the least model of $G_i$, and the variables of P are valuated by the least model of $G'_i$.

As a result we obtain that whenever the iteration terminates, program P is correct 
w.r.t. the specification given by the obtained grammar Gi. 

Notice that this is justified in a different way than usually done in abstract interpretation. Instead of relating a single iteration step to the concrete semantics of the program, we showed that the obtained fixpoint satisfies a sufficient condition for program correctness.

Termination. Usually the iterative process described above does not terminate. It should be augmented with means of assuring termination. The idea is to apply a restriction operator $\mathcal{R}$ that maps an infinite domain of grammars to its finite subset. Moreover, the operator $\mathcal{R}$ computes an approximation of a grammar G (i.e. $[\![Call]\!]_{G} \subseteq [\![Call]\!]_{\mathcal{R}(G)}$ and $[\![Success]\!]_{G} \subseteq [\![Success]\!]_{\mathcal{R}(G)}$). The operator is applied in every iteration step: the newly obtained grammar $G_{i+1}$ is replaced by a grammar $H_{i+1} = \mathcal{R}(G_{i+1})$. In this way we obtain a sequence of grammars $G_0, H_1, H_2, \ldots$; the sequence has the properties described in the previous paragraphs. Since the co-domain of $\mathcal{R}$ is finite, the set of grammars $\{ G_0, H_1, H_2, \ldots \}$ is finite and the iteration terminates. This technique can be seen as an instance of widening (Cousot & Cousot, 1992).
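
The interplay of the iteration step, the fixpoint test and the restriction operator can be illustrated on a toy domain. The sketch below is not the grammar-based algorithm: finite sets of integers (plus a top element) stand in for grammars, and `step`, `restrict`, `leq` and `iterate` are hypothetical names chosen for this illustration. It only shows why an over-approximating operator with a finite co-domain forces termination.

```python
TOP = "TOP"  # greatest element of the toy domain; stands for "all values"


def leq(a, b):
    """Inclusion test of the toy domain (plays the role of grammar inclusion)."""
    return b == TOP or (a != TOP and a <= b)


def step(g):
    """Toy valuation transformer: it never shrinks its argument."""
    return TOP if g == TOP else g | {x + 2 for x in g}


def restrict(g, bound=5):
    """Toy restriction operator: over-approximates and has a finite co-domain
    (as long as the values are non-negative integers)."""
    if g == TOP or any(x > bound for x in g):
        return TOP
    return g


def iterate(g0, step, restrict, leq, max_steps=100):
    """Apply the restricted step until the new value is included in the old one."""
    g = g0
    for _ in range(max_steps):
        new = restrict(step(g))
        if leq(new, g):      # fixpoint reached
            return g
        g = new
    raise RuntimeError("no fixpoint reached within max_steps")
```

Without `restrict` the toy transformer would grow the set forever; with it the sequence climbs through finitely many values and stops at the top element.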



An attempt at such an approach was made by Gallagher and de Waal (1994). Unfortunately, the termination proof given by the authors is erroneous and Mildner (1999) showed an artificial example which results in an infinite loop.

We adapt a technique presented in (Mildner, 1999), Section 6.5, and inspired by (Janssens & Bruynooghe, 1992). We describe it briefly. Let the principal label of a variable X be the set of function symbols occurring in the right hand sides of the rules defining X in a given grammar G. Let a term grammar graph be a directed graph with grammar variables as vertices. An edge (X, Y) belongs to the graph iff there is a rule $X > f(\ldots, Y, \ldots)$ in the grammar. The operator $\mathcal{R}$ computes an approximation of a grammar G ($[\![Call]\!]_{G} \subseteq [\![Call]\!]_{\mathcal{R}(G)}$ and $[\![Success]\!]_{G} \subseteq [\![Success]\!]_{\mathcal{R}(G)}$) assuring at the same time that there is a spanning tree of the graph of $\mathcal{R}(G)$ such that each branch of the tree contains no more than k variables with the same principal label. Since the grammar is discriminative, and since there is a finite number of function symbols in a program, the set of such spanning trees (modulo variable renaming) is finite and consequently the co-domain of $\mathcal{R}$ (modulo variable renaming) is finite. We usually apply $k = 1$.
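
The two notions used above, the principal label and the term grammar graph, are easy to compute. The following sketch does not implement the operator $\mathcal{R}$ itself; it only measures, for one depth-first spanning tree, how many variables with the same principal label occur on a branch, which is the quantity bounded by k. The representation and the names `principal_label`, `edges` and `max_repeats` are assumptions of this illustration.

```python
def principal_label(grammar, x):
    """Set of function symbols heading the rules that define x."""
    return frozenset(grammar[x])


def edges(grammar):
    """Edges (X, Y) of the term grammar graph: Y occurs in a rule X > f(..., Y, ...)."""
    return {(x, y) for x, rules in grammar.items()
            for args in rules.values() for y in args}


def max_repeats(grammar, root):
    """Largest number of variables sharing a principal label on one branch of a
    depth-first spanning tree rooted at root (each variable is expanded once)."""
    visited = set()
    best = 0

    def walk(x, counts):
        nonlocal best
        visited.add(x)
        label = principal_label(grammar, x)
        counts = dict(counts)                 # counts along the current branch only
        counts[label] = counts.get(label, 0) + 1
        best = max(best, counts[label])
        for args in grammar[x].values():
            for y in args:
                if y in grammar and y not in visited:
                    walk(y, counts)

    walk(root, {})
    return best
```

A grammar in which `max_repeats` exceeds the chosen k is a candidate for merging variables with equal principal labels, which is where the approximation is introduced.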

The reasoning above does not provide any useful estimation of the complexity of the algorithm. Our experience shows that it is sufficiently efficient to compute directional types of medium size programs.

There exist variants of this method, taking into account the number of occurrences of a single function symbol along a path, or simply bounding the depth of the spanning tree by a constant.

Another possibility to cope with the termination problem is to restrict the class of grammars so that the class of defined sets is a partial order of finite height.*

* For example Boye (1996) suggested that the inference is always done with a finite lattice of types. In practice this means that for a class of applications we may have a finite library of types, represented by grammars, which may be extended by need. This will also facilitate communication with the user, who will more easily understand standard application-specific types than the types represented by automatically generated grammars.

4 Parametric Set Constraints

4.1 Motivation

In Example 3.5, the correctness checking of the program was done without referring to a missing fragment $\langle M \rangle_G$ of the grammar that provided the specification. This was due to the fact that the constraints did not include generalized projections of $\langle M, G \rangle$ and all intersections involving M were of the form $M \cap M$ or $M \cap Any$, where Any is defined by the rule $Any > \top$. The meaning of such expressions is preserved if we simplify them to M. As a result we obtained a term grammar referring to M. The obtained solution is parametric in the sense that it will hold for any specific choice of the missing fragment of the grammar. Thus the example demonstrates parametric polymorphism of append, where calls and successes are approximated by sets determined by the same specific M. This kind of parametric polymorphism is useful in locating program errors (cf. the examples given later in the paper). In the rest of this section we extend previously introduced basic concepts to be able to handle parameters.



4.2 Syntax and Semantics

To define a notion of a parametric set constraint we extend the alphabet. In addition to the symbols discussed in Section 2.1 we assume that the alphabet also includes parameters disjoint with the other categories of symbols. Parameters will be denoted by Greek letters $\alpha, \beta, \ldots$. A parametric set expression is a parameter, a variable, a constant, or it has a form $f(e_1, \ldots, e_n)$, $t^{-X}(e)$ or $e_1 \cap e_2$, where f is an n-ary function symbol, t is a term, X a variable and $e, e_1, \ldots, e_n$ are parametric set expressions. Notice that this definition extends the usual definition of set expressions, so that a usual set expression without parameters becomes a special case of a parametric set expression. A parametric set expression is atomic if it does not include projection and intersection symbols.

For a given valuation of the variables, a parametric set expression denotes a func- 
tion from valuations of parameters to subsets of the Herbrand universe. The value 
of the function for a specific valuation of parameters is determined by considering 
parameters to be additional variables of the set expression. 

We will consider parametric set constraints of the form 

Variable > Parametric set expression. 

As discussed above, a collection of non-parametric set constraints has the least model which can be defined by a term grammar. A similar property holds in the parametric case. Take a collection C of parametric set constraints and treat the parameters as variables. For any given fixed valuation I of the parameters there exists the least model out of the models of C coinciding with I on the parameters. (This can be proved similarly to Proposition 2.9.)

In order to deal with sets of constrained terms, parametric set expressions can be generalized to parametric extended set expressions. This is done by permitting base symbols to appear in the expressions. Parametric extended set expressions give rise to parametric extended set constraints. For any fixed valuation of parameters, a collection of such constraints has the least model. (Proof as in Proposition 2.9.)



4.3 Parametric Term Grammars

Our parametric specifications will be expressed by parametric grammars. We first introduce parametric term grammars and a notion of an instance of such a grammar. Such instances define sets of terms. Then we extend this approach to define sets of constrained terms.

Definition 4.1

A parametric term grammar G is a finite collection of parametric set constraints of the form $X > t$ where X is a variable and t is an atomic parametric set expression.

For instance we can consider the grammar G of Example 3.5 as a parametric grammar with one parameter M.

In the context of parametric grammars, a (parametric) set descriptor is a pair $\langle X, G \rangle$ where G is a parametric grammar and X a variable or a parameter. The derivability relation is defined in the same way as for non-parametric term grammars. Notice, however, that the normal forms may include parameters.

Parameterless grammars are used to define sets; the role of parametric grammars is to define mappings on sets. This is done by assigning sets to the parameters of a grammar. The sets are given by some other grammars.

Let G be a parametric grammar such that $\alpha_1, \ldots, \alpha_k$ are all parameters occurring in G. Sometimes we will denote it $G(\vec{\alpha})$ where $\vec{\alpha} = (\alpha_1, \ldots, \alpha_k)$. A function $\Phi$ that maps each parameter $\alpha_i$ of G into a set descriptor $\langle X_i, G_i \rangle$ is called, abusing the standard terminology, a parameter valuation for G. For a given $\vec{\alpha}$ we will sometimes represent $\Phi = \{ \alpha_1 \mapsto \langle X_1, G_1 \rangle, \ldots, \alpha_k \mapsto \langle X_k, G_k \rangle \}$ as the vector $(\langle X_1, G_1 \rangle, \ldots, \langle X_k, G_k \rangle)$.

Definition 4.2

Let G be a parametric term grammar and let $\Phi = \{ \alpha_1 \mapsto \langle X_1, G_1 \rangle, \ldots, \alpha_k \mapsto \langle X_k, G_k \rangle \}$ be a parameter valuation.

An instance of G under $\Phi$ is the parametric grammar $G(\Phi) = G' \cup G'_1 \cup \ldots \cup G'_k$, where

• $\langle X'_i, G'_i \rangle$ are obtained by renaming apart all variables in each $\langle X_i, G_i \rangle$ so that the grammar G and descriptors $\langle X'_1, G'_1 \rangle, \ldots, \langle X'_k, G'_k \rangle$ have pairwise disjoint sets of variables.

• $G'$ is obtained by replacing each parameter $\alpha_i$ in G by $X'_i$.

If $G(\Phi)$ contains no parameters then the usual notion of the sets defined by a grammar applies to $G(\Phi)$.* For each of its variables X it defines a set, which is $[\![X]\!]_{G(\Phi)}$. So a parametric grammar $G(\alpha_1, \ldots, \alpha_k)$ defines a mapping from the sets corresponding to descriptors $\langle X_1, G_1 \rangle, \ldots, \langle X_k, G_k \rangle$ to the sets defined by the grammar $G(\Phi)$. Moreover, $G(\Phi)$ defines the value for each parameter $\alpha_i$ of G:

    $[\![\alpha_i]\!]_{G(\Phi)} = [\![X'_i]\!]_{G(\Phi)}$.

* It applies also to any parametric grammar H and to each variable X such that $\langle X \rangle_H$ is parameterless.

The definition of an instance generalizes in an obvious way from parametric grammars to sets of (extended) parametric set constraints.

Definition 4.3

A parametric term grammar is discriminative if

• each right hand side of a rule is of the form $f(X_1, \ldots, X_n)$ where each $X_i$ is a variable or a parameter,

• for a given variable X and given n-ary (n > 0) function symbol f there is at most one rule of the form $X > f(\ldots)$.

Notice that the instance of a discriminative grammar under a parameter valuation over discriminative grammars is discriminative.

Example 4.4

Let grammar $G(\alpha)$ be

    $List > nil$
    $List > cons(\alpha, List)$

This grammar is discriminative. Consider $\Phi = \{ \alpha \mapsto \langle List, G \rangle \}$. Since $\Phi$ shares variables with G we rename it apart to obtain $\langle List1, G' \rangle$, where $G'$ is:

    $List1 > nil$
    $List1 > cons(\alpha, List1)$

(The parameters are not renamed, since they are not variables.) $G(\Phi)$ is

    $List > nil$                    $List1 > nil$
    $List > cons(List1, List)$      $List1 > cons(\alpha, List1)$
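
The construction of an instance can be sketched in a few lines. This is an illustration only: the representation of a grammar as a dictionary from variables to lists of `(symbol, arguments)` rules, the names `rename` and `instance`, and the renaming scheme (appending the index of the parameter to every variable of the substituted descriptor, assumed not to clash with existing names) are hypothetical.

```python
def rename(grammar, suffix):
    """Rename every variable X of the grammar to X+suffix; parameters are kept."""
    def ren(symbol):
        return symbol + suffix if symbol in grammar else symbol
    return {ren(x): [(f, tuple(ren(a) for a in args)) for f, args in rules]
            for x, rules in grammar.items()}


def instance(grammar, valuation):
    """Build G(Phi) for valuation = {parameter: (variable, grammar)}."""
    result = {}
    substitution = {}
    for i, (param, (x, g)) in enumerate(valuation.items(), start=1):
        suffix = str(i)                       # renaming apart (assumed to be fresh)
        result.update(rename(g, suffix))      # the rules of G'_i
        substitution[param] = x + suffix      # the parameter is replaced by X'_i
    for x, rules in grammar.items():          # the rules of G'
        result[x] = [(f, tuple(substitution.get(a, a) for a in args))
                     for f, args in rules]
    return result
```

With `G = {"List": [("nil", ()), ("cons", ("alpha", "List"))]}` and the valuation `{"alpha": ("List", G)}` the function returns the four rules of $G(\Phi)$ listed in Example 4.4.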

We will use the following notation, when it does not lead to ambiguity. Let G be a discriminative parametric grammar, X a variable and $\vec{\alpha} = (\alpha_1, \ldots, \alpha_k)$ the parameters occurring in G. By the (parametric) type $X(\vec{\alpha})$ we mean the family of sets defined by X in G (more precisely the mapping from parameter valuations to sets, assigning $[\![X]\!]_{G(\Phi)}$ to $\Phi$). In the special case of a parameterless grammar G, type X is the set $[\![X]\!]_{G}$. Let $\Phi = \{ \alpha_1 \mapsto \langle X_1, G_1 \rangle, \ldots, \alpha_k \mapsto \langle X_k, G_k \rangle \}$ be a parameter valuation, where the grammars are discriminative and the parameters occurring in $G_i$ are $\vec{\alpha}_i$, for $i = 1, \ldots, k$. Then by type $X(X_1(\vec{\alpha}_1), \ldots, X_k(\vec{\alpha}_k))$ we mean the family of sets defined by X in grammar $G(\Phi)$.

For instance the mapping corresponding to variable List in grammar $G(\alpha)$ of the last example can be called $List(\alpha)$. The mapping corresponding to List in $G(\Phi)$ can be called $List(List(\alpha))$.

Instances of parametric discriminative term grammars define sets of terms. Similarly as in the non-parametric case, we generalize this formalism to specify sets of constrained terms. Assume a fixed constraint domain $\mathcal{D}$.

Definition 4.5

A discriminative parametric extended term grammar (PED grammar) G is a finite set of rules of the form

    $X > f(X_1, \ldots, X_n)$   or   $X > b$

where f is an n-ary function symbol ($n \geq 0$), X is a variable, $X_1, \ldots, X_n$ are variables or parameters and b is a base symbol. Additionally, for each pair of rules $X > t_1$ and $X > t_2$ in G the sets $[\![\hat{t}_1]\!]$ and $[\![\hat{t}_2]\!]$ are disjoint (where $\hat{t}_i$ stands for $t_i$ with each occurrence of a variable or a parameter replaced by $\top$).

The definition of an instance of a grammar applies to parametric extended grammars too. A parameterless instance of such a grammar defines a set of constrained atoms for each variable, as described in Section 2.2.2.

Example 4.6

Take the grammar $G(\alpha)$ from the previous example. Using $\Phi = \{ \alpha \mapsto \langle Any, \{Any > \top\} \rangle \}$ we obtain $G(\Phi)$ defining lists of arbitrary constrained terms. Formally, $\langle List, G(\Phi) \rangle$ defines the set $\{\, c \mathbin{\Box} [t_1, \ldots, t_n] \mid n \geq 0,\ t_i \text{ are terms} \,\}$ (as any term of the form $[\top, \ldots, \top]$ can be generated from List in grammar $G(\Phi)$).



4.4 Operations on extended parametric term grammars

We now extend the operations of Section 2.1.2 to extended parametric discriminative term grammars. For each of them we show how the resulting grammar approximates a relevant set operation for each parameterless instance of the arguments.

4.4.1 Emptiness Check and Construction

A variable X in a PED grammar G will be called nullable if no variable-free term (i.e. a term consisting entirely of function symbols, base symbols and parameters) can be derived from X in G. So for a nullable X, $[\![X]\!]_{G(\Phi)} = \emptyset$ independently of $\Phi$. Similarly as in the non-parametric case, algorithms for finding nullable symbols in context-free grammars can be applied here. Notice that for a non-nullable X there exists a $\Phi$ such that $[\![X]\!]_{G(\Phi)} \neq \emptyset$ (provided that the grammar does not contain a base symbol b for which $[\![b]\!] = \emptyset$).
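
The emptiness check is the standard fixpoint computation of productive symbols known from context-free grammars. A minimal sketch, assuming a grammar is a dictionary from variables to lists of `(symbol, arguments)` rules, with a base symbol written as a rule without arguments; the name `nullable_variables` is hypothetical.

```python
def nullable_variables(grammar, parameters):
    """Return the variables from which no variable-free term can be derived.

    A variable is productive if it has a rule whose arguments are all
    parameters or productive variables; nullable variables are the others.
    """
    productive = set()
    changed = True
    while changed:
        changed = False
        for x, rules in grammar.items():
            if x in productive:
                continue
            for _symbol, args in rules:
                if all(a in parameters or a in productive for a in args):
                    productive.add(x)
                    changed = True
                    break
    return set(grammar) - productive
```

For example, a variable `Inf` whose only rule is `Inf > cons(alpha, Inf)` is reported as nullable, whereas `List` with the rules `List > nil` and `List > cons(alpha, List)` is not.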

The construction operation extends naturally to parametric grammars. Let $\langle X_1, G_1 \rangle, \ldots, \langle X_n, G_n \rangle$ be set descriptors with pairwise disjoint variables and let f be an n-ary function symbol. By $f(\langle X_1, G_1 \rangle, \ldots, \langle X_n, G_n \rangle)$ we denote the set descriptor $\langle Y, G \rangle$, where Y is a new variable and

    $G = \{ Y > f(X_1, \ldots, X_n) \} \cup G_1 \cup \ldots \cup G_n$

(When the set descriptors have some common variables then $f(\langle X_1, G_1 \rangle, \ldots, \langle X_n, G_n \rangle)$ can be defined by renaming apart the variables in the descriptors.) Clearly:

Proposition 4.7

For any parameter valuation $\Phi$ the set descriptors $f(\langle X_1, G_1 \rangle, \ldots, \langle X_n, G_n \rangle)(\Phi)$ and $f(\langle X_1, G_1(\Phi) \rangle, \ldots, \langle X_n, G_n(\Phi) \rangle)$ are identical (up to renaming of the variables introduced while building the grammar instances and of the variable introduced by the construction operation).

If $G_1(\Phi), \ldots, G_n(\Phi)$ do not contain parameters then

    $f([\![X_1]\!]_{G_1(\Phi)}, \ldots, [\![X_n]\!]_{G_n(\Phi)}) = [\![Y]\!]_{f(\langle X_1, G_1 \rangle, \ldots, \langle X_n, G_n \rangle)(\Phi)}$

4.4.2 Intersection

Let $G_1$ and $G_2$ be PED grammars. We assume without loss of generality that they have no common variables, but they may have common parameters. We define an operation $\dot\cap$ on such grammars; the result is a PED grammar $G_1 \mathbin{\dot\cap} G_2$. The variables of $G_1 \mathbin{\dot\cap} G_2$ include the variables of $G_1$, the variables of $G_2$ and new variables corresponding to pairs (X, Y) where X is a variable of $G_1$ and Y is a variable of $G_2$. The latter will be denoted $X \mathbin{\dot\cap} Y$.

We define $G_1 \mathbin{\dot\cap} G_2$ to consist of the rules of $G_1$, those of $G_2$ and, for each $X > s \in G_1$ and $Y > t \in G_2$, at most one rule as described below.

• $X \mathbin{\dot\cap} Y > f(s_1 \circ t_1, \ldots, s_n \circ t_n)$ ($n \geq 0$), provided that $s = f(s_1, \ldots, s_n)$, $t = f(t_1, \ldots, t_n)$ and $s_i \circ t_i$ is the following symbol:

  1. it is the variable $s_i \mathbin{\dot\cap} t_i$, if $s_i$ and $t_i$ are variables,

  2. it is $s_i$, if $s_i$ and $t_i$ are parameters,

  3. it is the variable, if one of $s_i$, $t_i$ is a variable and the other is a parameter.

• $X \mathbin{\dot\cap} Y > u$, provided that at least one of s, t is a base symbol and the following holds. Let us denote $\{s_1, s_2\} = \{s, t\}$ where $s_1$ is a base symbol. Now

  — $s_1 = \top$ and $u = s_2$, or $s_2 = \top$ and $u = s_1$, or

  — $s_2$ is a constant $c \in [\![s_1]\!]$ and u is c, or

  — $s_2$ is a base symbol, $[\![s_1]\!] \subseteq [\![s_2]\!]$ and $u = s_1$, or $[\![s_2]\!] \subseteq [\![s_1]\!]$ and $u = s_2$.*

* According to our assumptions on base sets, $[\![s_1]\!] \cap [\![f(\top, \ldots, \top)]\!] = \emptyset$. If $s_2 = f(\ldots)$ then no rule corresponding to $X > s$, $Y > t$ should appear in $G_1 \mathbin{\dot\cap} G_2$.

Some decisions in this construction are arbitrary. Instead of choosing $s_i \circ t_i$ to be $s_i$ when both $s_i$, $t_i$ are parameters, one may choose $t_i$. For the case of $s_i$, $t_i$ being a parameter and a variable one may choose $s_i \circ t_i$ to be the parameter. In the latter case we expect that our choice gives more useful results when further operations are applied to $G_1 \mathbin{\dot\cap} G_2$, as a variable corresponds to a known set of rules while a parameter does not.

We notice that by construction $G_1 \mathbin{\dot\cap} G_2$ is a PED grammar and all its parameters (if any) appear in $G_1$ or in $G_2$. The construction guarantees also the following property.

Proposition 4.8

For every parameter valuation $\Phi$ such that $G_1(\Phi)$ and $G_2(\Phi)$ are parameterless grammars we have

    $[\![X]\!]_{G_1(\Phi)} \cap [\![Y]\!]_{G_2(\Phi)} \subseteq [\![X \mathbin{\dot\cap} Y]\!]_{(G_1 \dot\cap G_2)(\Phi)}$

for all variables X in $G_1$ and Y in $G_2$.

Proof

Denote $G_1 \mathbin{\dot\cap} G_2$ by G. It is sufficient to show that if $X \Rightarrow^{*}_{G_1(\Phi)} t$, $Y \Rightarrow^{*}_{G_2(\Phi)} u$ and $[\![t]\!] \cap [\![u]\!] \neq \emptyset$ then there exists a term w such that $X \mathbin{\dot\cap} Y \Rightarrow^{*}_{G(\Phi)} w$ and $[\![t]\!] \cap [\![u]\!] \subseteq [\![w]\!]$.

The proof is by induction on $\max(|t|, |u|)$ (where $|s|$ is the size of a term s).

If $t = \top$ then $X > \top \in G_1$, $X \mathbin{\dot\cap} Y \Rightarrow^{*}_{G(\Phi)} u$ and u is the required w. Similarly, w is t in the symmetric case of $u = \top$.

If none of t, u is $\top$ and one of them is a base symbol then the other is a base symbol or a constant. Two cases are possible: $[\![t]\!] \subseteq [\![u]\!]$, rule $X \mathbin{\dot\cap} Y > t$ is in G and $w = t$; or $[\![t]\!] \supseteq [\![u]\!]$, $X \mathbin{\dot\cap} Y > u \in G$ and $w = u$.

Otherwise $t = f(t_1, \ldots, t_n)$, $u = f(u_1, \ldots, u_n)$ (for some function symbol f of arity $n \geq 0$) and the considered derivations are $X \Rightarrow f(X_1, \ldots, X_n) \Rightarrow^{*} t$ and $Y \Rightarrow f(Y_1, \ldots, Y_n) \Rightarrow^{*} u$. Grammar G contains a rule $X \mathbin{\dot\cap} Y > f(X_1 \circ Y_1, \ldots, X_n \circ Y_n)$ and $G(\Phi)$ contains $X \mathbin{\dot\cap} Y > f(Z_1, \ldots, Z_n)$, where $X_i \circ Y_i = Z_i$ unless $X_i \circ Y_i$ is a parameter. For each $i = 1, \ldots, n$ we have three cases.

1. $X_i \circ Y_i$ is the variable $X_i \mathbin{\dot\cap} Y_i$. $X_i \Rightarrow^{*}_{G_1(\Phi)} t_i$ and $Y_i \Rightarrow^{*}_{G_2(\Phi)} u_i$. Clearly, $\max(|t_i|, |u_i|) < \max(|t|, |u|)$. By the inductive assumption there exists a term $w_i$ such that $Z_i = X_i \mathbin{\dot\cap} Y_i \Rightarrow^{*}_{G(\Phi)} w_i$ and $[\![t_i]\!] \cap [\![u_i]\!] \subseteq [\![w_i]\!]$.

2. $X_i \circ Y_i$ is a parameter from $G_1$. Then $Z_i \Rightarrow^{*} t_i$ both in $G_1(\Phi)$ and $G(\Phi)$.

3. $X_i \circ Y_i$ is a variable from $G_1$ or $G_2$. Thus $Z_i \Rightarrow^{*} t_i$ both in $G_1(\Phi)$ and $G(\Phi)$, or $Z_i \Rightarrow^{*} u_i$ both in $G_2(\Phi)$ and $G(\Phi)$.

This shows that for $i = 1, \ldots, n$ there exists a $w_i$ such that $Z_i \Rightarrow^{*}_{G(\Phi)} w_i$ and $[\![t_i]\!] \cap [\![u_i]\!] \subseteq [\![w_i]\!]$. Hence $X \mathbin{\dot\cap} Y \Rightarrow^{*}_{G(\Phi)} f(w_1, \ldots, w_n)$ and $[\![t]\!] \cap [\![u]\!] \subseteq [\![f(w_1, \ldots, w_n)]\!]$.
□

Example 4.9

Grammar $G_1$ describes parametric non-empty lists and grammar $G_2$ specifies lists of natural numbers:

    $G_1$:  $NEList > cons(\alpha, List)$        $G_2$:  $ListN > nil$
            $List > nil$                                 $ListN > cons(Nat, ListN)$
            $List > cons(\alpha, List)$

Computing $NEList \mathbin{\dot\cap} ListN$ gives a rule:

    $NEList \mathbin{\dot\cap} ListN > cons(Nat, List \mathbin{\dot\cap} ListN)$

The new variable $List \mathbin{\dot\cap} ListN$ is defined by the following rules:

    $List \mathbin{\dot\cap} ListN > nil$
    $List \mathbin{\dot\cap} ListN > cons(Nat, List \mathbin{\dot\cap} ListN)$

Thus we obtained a non-empty list of natural numbers as a result.
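
The result of this example can be reproduced with a small sketch of the intersection construction. It is a simplification made for illustration: $\top$ (encoded as `TOP`) is the only base symbol handled, every function symbol is assumed to have a fixed arity, only the pair variables reachable from the requested pair are generated, and the name `intersect` as well as the spelling `X&Y` for the pair variable are hypothetical.

```python
TOP = "TOP"  # stands for the base symbol T; the only base symbol handled here


def intersect(g1, g2, params, x, y):
    """Return (pair_variable, grammar) for the intersection of x (in g1) and y (in g2).

    A grammar maps a variable to {function_symbol: tuple_of_arguments};
    params is the set of parameters; the grammars share no variables.
    """
    result = {**g1, **g2}
    todo = [(x, y)]
    while todo:
        a, b = todo.pop()
        pair = f"{a}&{b}"
        if pair in result:
            continue
        rules = {}
        if TOP in g1[a]:            # s1 = T, so the rules of the other side are kept
            rules = dict(g2[b])
        elif TOP in g2[b]:
            rules = dict(g1[a])
        else:
            for f, ss in g1[a].items():
                ts = g2[b].get(f)
                if ts is None or len(ts) != len(ss):
                    continue        # no common function symbol: no rule
                args = []
                for s, t in zip(ss, ts):
                    if s in params:
                        # two parameters: choose s; parameter and variable: the variable
                        args.append(s if t in params else t)
                    elif t in params:
                        args.append(s)
                    else:           # two variables: a new pair variable
                        args.append(f"{s}&{t}")
                        todo.append((s, t))
                rules[f] = tuple(args)
        result[pair] = rules
    return f"{x}&{y}", result
```

Applied to the grammars of Example 4.9 with the parameter set `{"alpha"}`, the variable `NEList&ListN` gets the single rule with arguments `("Nat", "List&ListN")`, and `List&ListN` gets the rules for `nil` and `cons`, as in the example.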

4.4.3 Union

Let $G_1$ and $G_2$ be PED grammars. We assume without loss of generality that they have no common variables, but they may have common parameters. We define an operation $\dot\cup$ on such grammars; the result is a PED grammar G, denoted $G_1 \mathbin{\dot\cup} G_2$. The variables of G include the variables of $G_1$, the variables of $G_2$ and new variables corresponding to pairs (X, Y) where X is a variable of $G_1$ and Y is a variable of $G_2$. The latter will be denoted $X \mathbin{\dot\cup} Y$.

Now G consists of the rules of $G_1 \cup G_2$ and, for each $X \mathbin{\dot\cup} Y$, of the rules constructed as follows. Let $R = \{\, t \mid X > t \in G_1 \text{ or } Y > t \in G_2 \,\}$. If $\top \in R$ then G contains $X \mathbin{\dot\cup} Y > \top$, otherwise:

1. If $f(s_1, \ldots, s_n) \in R$ (n > 0) and no other $f(t_1, \ldots, t_n)$ is in R then G contains $X \mathbin{\dot\cup} Y > f(s_1, \ldots, s_n)$.

2. For each pair $f(s_1, \ldots, s_n)$, $f(t_1, \ldots, t_n)$ of distinct elements of R (n > 0),* G contains $X \mathbin{\dot\cup} Y > f(s_1 \circ t_1, \ldots, s_n \circ t_n)$, where each $s_i \circ t_i$ is

   • $s_i \mathbin{\dot\cup} t_i$, if $s_i$, $t_i$ are variables,

   • $s_i$, if $s_i = t_i$ and $s_i$ is a parameter,

   • a new variable V otherwise. In this case also the rule $V > \top$ is in G.

3. $X \mathbin{\dot\cup} Y > s$ is in G for each $s \in R$ such that s is a constant or a base symbol and $[\![s]\!] \not\subseteq [\![t]\!]$ for any base symbol $t \in R$, $t \neq s$.

* Notice that for a given f at most two such elements exist.

The result of the construction is a PED grammar. Its parameters (if any) may only originate from $G_1$ and $G_2$. The construction is similar to that for discriminative term grammars. The union involving parameters is approximated by $\top$ unless both arguments are the same parameter. This is because we want the construction to approximate the union for all parameter valuations.

Proposition 4.10

For every parameter valuation $\Phi$ such that $G_1(\Phi)$ and $G_2(\Phi)$ are parameterless grammars we have

    $[\![X]\!]_{G_1(\Phi)} \cup [\![Y]\!]_{G_2(\Phi)} \subseteq [\![X \mathbin{\dot\cup} Y]\!]_{(G_1 \dot\cup G_2)(\Phi)}$

for all variables X in $G_1$ and Y in $G_2$.

Proof

Denote $[\![X \mathbin{\dot\cup} Y]\!]_{(G_1 \dot\cup G_2)(\Phi)}$ by R. It is sufficient to show that if $X \Rightarrow^{*}_{G_1(\Phi)} s$ or $Y \Rightarrow^{*}_{G_2(\Phi)} s$, where s is ground, then $[\![s]\!] \subseteq R$. We show this by induction on the derivation length. We can assume that the same renaming of the variables of $\Phi$ has been used in constructing $G_1(\Phi)$, $G_2(\Phi)$ and $(G_1 \mathbin{\dot\cup} G_2)(\Phi)$.

Assume that $V \Rightarrow_{H} s_0 \Rightarrow^{*}_{H} s$, where $V = X$, $H = G_1(\Phi)$ or $V = Y$, $H = G_2(\Phi)$. We have two cases.

• $s_0$ is a constant or base symbol (so $s_0 = s$). There is a rule $X \mathbin{\dot\cup} Y > s'$ in $G_1 \mathbin{\dot\cup} G_2$ such that $[\![s_0]\!] \subseteq [\![s']\!]$. We have $[\![s]\!] \subseteq [\![s']\!] \subseteq R$.

• $s_0 = f(X_1, \ldots, X_n)$ (where n > 0), $s = f(u_1, \ldots, u_n)$ and $X_i \Rightarrow^{*}_{H} u_i$ for each $i = 1, \ldots, n$. Grammar $G_1 \mathbin{\dot\cup} G_2$ contains a rule $X \mathbin{\dot\cup} Y > \top$ or $X \mathbin{\dot\cup} Y > f(Y_1, \ldots, Y_n)$. In the first case the result is immediate. In the second case the rule has been introduced by clause 1 or clause 2 of the definition of $G_1 \mathbin{\dot\cup} G_2$.

  In the case of clause 1, $(G_1 \mathbin{\dot\cup} G_2)(\Phi)$ contains $X \mathbin{\dot\cup} Y > f(X_1, \ldots, X_n)$ and $Y_i = X_i$ whenever $Y_i$ is a variable. $X \mathbin{\dot\cup} Y \Rightarrow f(X_1, \ldots, X_n) \Rightarrow^{*} s$ is a derivation of $(G_1 \mathbin{\dot\cup} G_2)(\Phi)$, as $H \subseteq (G_1 \mathbin{\dot\cup} G_2)(\Phi)$. Hence $[\![s]\!] \subseteq R$.

  In the case of clause 2, each $Y_i$ is $s_i \circ t_i$. If $s_i \circ t_i$ is $s_i \mathbin{\dot\cup} t_i$ then $s_i$, $t_i$ are variables, one of them is $X_i$ and by the inductive assumption $[\![u_i]\!] \subseteq [\![s_i \mathbin{\dot\cup} t_i]\!]_{(G_1 \dot\cup G_2)(\Phi)}$, as $X_i \Rightarrow^{*}_{H} u_i$. If $s_i \circ t_i$ is a parameter then $s_i \circ t_i = s_i = t_i$. In $(G_1 \mathbin{\dot\cup} G_2)(\Phi)$ this parameter is replaced by $X_i$. Notice that in this grammar $X_i \Rightarrow^{*} u_i$. The last possibility is that $s_i \circ t_i$ is a variable W and $W > \top$ is in $G_1 \mathbin{\dot\cup} G_2$. So $(G_1 \mathbin{\dot\cup} G_2)(\Phi)$ contains a rule $X \mathbin{\dot\cup} Y > f(r_1, \ldots, r_n)$ where $r_i = s_i \circ t_i$ and $[\![u_i]\!] \subseteq [\![r_i]\!]_{(G_1 \dot\cup G_2)(\Phi)}$, for $i = 1, \ldots, n$. Hence

    $[\![s]\!] = [\![f(u_1, \ldots, u_n)]\!] \subseteq [\![f(r_1, \ldots, r_n)]\!]_{(G_1 \dot\cup G_2)(\Phi)} \subseteq [\![X \mathbin{\dot\cup} Y]\!]_{(G_1 \dot\cup G_2)(\Phi)}$.

□

The requirement that $G_1$, $G_2$ have no common variables is inessential when
$G_1 = G_2$. This holds both for $\dot\cap$ and $\dot\cup$ and follows from the proofs of the last
two propositions.

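Example 4.11

Consider the grammars from Example 4.9, $G_1$ specifying parametric non-empty
lists and $G_2$ describing lists of natural numbers.

$$\begin{array}{llll}
G_1: & NEList \to cons(\alpha, List) \qquad & G_2: & ListN \to nil \\
 & List \to nil & & ListN \to cons(Nat, ListN) \\
 & List \to cons(\alpha, List) & &
\end{array}$$

The rules defining $NEList \dot\cup ListN$ are

$$\begin{array}{l}
NEList \dot\cup ListN \to nil \\
NEList \dot\cup ListN \to cons(V, List \dot\cup ListN) \\
V \to \top
\end{array}$$

where $V$ is a new variable. There are similar rules for $List \dot\cup ListN$.

$$\begin{array}{l}
List \dot\cup ListN \to nil \\
List \dot\cup ListN \to cons(W, List \dot\cup ListN) \\
W \to \top
\end{array}$$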



4.4.4 Generalized projection for parametric sets

Let $(Y, G)$ be a set descriptor, where $G$ is a PED grammar, and $t$ be a term. We
are going to construct a PED grammar defining (a superset of) $t^{-X}([\![Y]\!]_{G(\Phi)})$.

We first construct a mapping $\xi(t, G, Y)$ assigning to each subterm occurrence $u$
in $t$ a variable or a parameter $V_u$. $V_u$ occurs in $G$ or is a new variable $Any$. Mapping
$\xi(t, G, Y)$ has the following properties:

1. $V_t$ is $Y$.

2. If $u = f(u_1, \ldots, u_n)$ ($n > 0$) and $V_u$ is a parameter or $Any$ then $V_{u_1} = \ldots = V_{u_n} = Any$.

3. If $u = f(u_1, \ldots, u_n)$ ($n \ge 0$) and $V_u$ is a variable of $G$ then

• $V_u \to f(V_{u_1}, \ldots, V_{u_n}) \in G$, or

• $V_u \to b \in G$, where $b$ is a base symbol, $u \in [\![b]\!]$ and $V_{u_1} = \ldots = V_{u_n} = Any$.
(Notice that if $n \neq 0$ then $b = \top$.)

If $\xi(t, G, Y)$ exists then it is unique, because the grammar is discriminative. $\xi(t, G, Y)$
can be constructed by an obvious algorithm similar to that described in Section 2.1.2.
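The three properties above determine the mapping top-down, so it can be computed by a single traversal of the term. The following is only a minimal sketch under assumptions that are not part of the paper: terms and grammars are encoded as plain Python tuples and dictionaries, subterm occurrences are identified by their paths, the name `xi` and the membership test `in_base` are hypothetical, and by default only the base symbol $\top$ (written `TOP`) is recognized.

```python
ANY = "Any"   # the fresh variable Any (assumed to come with the rule Any -> TOP)
TOP = "TOP"   # stands for the base symbol "top"

def xi(term, grammar, params, start, in_base=lambda b, u: b == TOP):
    """Map every subterm occurrence of `term` (keyed by its path) to a symbol.

    Terms: a program variable is a str; f(t1..tn) is ("f", [t1, ..., tn]).
    Grammar: {variable: [rhs, ...]}; an rhs is ("f", (V1, ..., Vn)) or a base
    symbol given as a str. Returns None when no mapping exists.
    """
    mapping = {}

    def visit(u, path, sym):
        mapping[path] = sym
        if isinstance(u, str):               # program variable: no condition
            return True
        functor, args = u
        if sym == ANY or sym in params:      # property 2: children get Any
            return all(visit(a, path + (i,), ANY) for i, a in enumerate(args))
        for rhs in grammar.get(sym, []):     # property 3: follow a rule of G
            if isinstance(rhs, tuple):
                if rhs[0] == functor and len(rhs[1]) == len(args):
                    return all(visit(a, path + (i,), v)
                               for i, (a, v) in enumerate(zip(args, rhs[1])))
            elif in_base(rhs, u):
                return all(visit(a, path + (i,), ANY) for i, a in enumerate(args))
        return False                          # no applicable rule

    return mapping if visit(term, (), start) else None


# Parametric lists: List -> nil | cons(alpha, List)
LIST = {"List": [("nil", ()), ("cons", ("alpha", "List"))]}
t = ("cons", ["X", ("cons", [("f", ["Z"]), "Y"])])   # cons(X, cons(f(Z), Y))
result = xi(t, LIST, {"alpha"}, "List")
```

In this encoding the occurrence of `Z` below the parameter `alpha` is mapped to `Any`, as property 2 requires, and a term with no matching rule yields `None`.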

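Proposition 4.12

Let $G$ be a PED grammar and $G' = G \cup \{Any \to \top\}$. Let $t$ be a term and $X^1, \ldots, X^k$
($k > 0$) be the occurrences of a variable $X$ in $t$. If $\xi(t, G, Y)$ exists then

$$t^{-X}([\![Y]\!]_{G(\Phi)}) \subseteq \bigcap_i [\![V_{X^i}]\!]_{G'(\Phi)}$$

for any parameter valuation $\Phi$ such that $G(\Phi)$ is parameterless.

If $\xi(t, G, Y)$ does not exist or $\bigcap_i [\![V_{Z^i}]\!]_{G'(\Phi)} = \emptyset$ for some variable $Z$ of $t$ then
$t^{-X}([\![Y]\!]_{G(\Phi)}) = \emptyset$.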

Proof

Consider a $\Phi$ as above. Let $H = G(\Phi)$ and $H' = G'(\Phi)$.

We begin with showing the following property. Let $c \mathbin{[]} u$ be a constrained term
and $V_u$ be some variable or parameter of $G'$. If $c \mathbin{[]} u\theta \in [\![V_u]\!]_{H'}$ then $V_u$ satisfies the
conditions for $\xi(t, G, Y)$ above (for some $V_{u_1}, \ldots, V_{u_n}$).

Assume that $u$ is not a variable (otherwise the conditions hold vacuously) and
that $c \mathbin{[]} u\theta \in [\![V_u]\!]_{H'}$. For $V_u$ being a parameter or $Any$ the conditions trivially
hold. Let $V_u$ be a variable of $G$. We have $V_u \Rightarrow^*_{H'} s$ and $c \mathbin{[]} u\theta \in [\![s]\!]$, where
$s = f(s_1, \ldots, s_n)$, $u = f(u_1, \ldots, u_n)$ and $V_u \Rightarrow_{H'} f(X_1, \ldots, X_n)$, or $s$ is a base
symbol and $V_u \Rightarrow_{H'} s$. Then a rule $V_u \to f(X_1, \ldots, X_n)$, respectively $V_u \to s$, exists
in $G$; the rule has the required properties.

Now we show that if $c \mathbin{[]} t\theta \in [\![Y]\!]_H$ then mapping $\xi(t, G, Y)$ exists and for any
subterm $u$ of $t$, $c \mathbin{[]} u\theta \in [\![V_u]\!]_{H'}$. The latter is equivalent to existence of a ground
term $s$ such that $V_u \Rightarrow^*_{H'} s$ and $c \mathbin{[]} u\theta \in [\![s]\!]$.

The proof is by induction. Let $u$ be a subterm of $t$ and

$$U = \{\, u' \mid u \text{ is a proper subterm of } u',\ u' \text{ is a subterm of } t \,\}.$$

Assume that the required mapping exists on $U$. (So $c \mathbin{[]} u'\theta \in [\![V_{u'}]\!]_{H'}$ for each $u' \in U$
and the conditions for $\xi(t, G, Y)$ are satisfied.) We show that such a mapping exists
for $U \cup \{u\}$. It is sufficient to show that $c \mathbin{[]} u\theta \in [\![V_u]\!]_{H'}$; then it follows that $V_u$
satisfies the conditions for $\xi(t, G, Y)$ from the property discussed above.

If $u = t$ then $c \mathbin{[]} u\theta \in [\![V_u]\!]_{H'}$ obviously holds. Otherwise there exists a subterm
$u' = f(u_1, \ldots, u_n)$ of $t$ such that $u = u_i$ for some $i$, and a ground term $s'$ such that
$V_{u'} \Rightarrow^*_{H'} s'$ and $c \mathbin{[]} u'\theta \in [\![s']\!]$.

If $V_{u'}$ is a parameter or $Any$ then $V_u$ is $Any$ and $c \mathbin{[]} u\theta \in [\![Any]\!]_{H'}$. The same
reasoning is applicable when $V_{u'}$ is a variable of $G$ and $V_{u'} \to b \in G$, as then $b = \top$
and $V_u = Any$.

It remains to consider the case of $V_{u'}$ being a variable of $G$ such that $V_{u'} \Rightarrow_{H'}
f(V_{u_1}, \ldots, V_{u_n}) \Rightarrow^*_{H'} s' = f(s_1, \ldots, s_n)$. So $V_{u'} \to f(V_{u_1}, \ldots, V_{u_n}) \in G$ and $V_{u_i} \Rightarrow^*_{H'}
s_i$. From $c \mathbin{[]} f(u_1, \ldots, u_n)\theta \in [\![s']\!]$ it follows that $c \mathbin{[]} u\theta \in [\![s_i]\!] \subseteq [\![V_u]\!]_{H'}$. This
completes the inductive proof.

Thus if $c \mathbin{[]} t\theta \in [\![Y]\!]_H$ then $c \mathbin{[]} X^i\theta \in [\![V_{X^i}]\!]_{H'}$ for any occurrence $X^i$ of $X$ in $t$.
Hence

$$c \mathbin{[]} X\theta \in \bigcap_i [\![V_{X^i}]\!]_{G'(\Phi)} \quad\text{and thus}\quad t^{-X}([\![Y]\!]_{G(\Phi)}) \subseteq \bigcap_i [\![V_{X^i}]\!]_{G'(\Phi)}.$$

Notice that if $\xi(t, G, Y)$ does not exist or the intersection above is empty then
$c \mathbin{[]} t\theta \notin [\![Y]\!]_H$ for any $c$, $\theta$, and $t^{-Z}([\![Y]\!]_H) = \emptyset$ for any variable $Z$. □

The proposition suggests the following algorithm to compute a set descriptor
$t^{-X}((Y, G))$ giving an approximation of the set $t^{-X}([\![Y]\!]_{G(\Phi)})$.

1. Compute $\xi(t, G, Y)$.

2. For each variable $Z$ with the occurrences $Z^1, \ldots, Z^k$ in $t$, apply the intersection
algorithm for PED grammars (Section 4.4.2) to compute (an approximation of)
$\bigcap_i [\![V_{Z^i}]\!]_{G'(\Phi)}$. This results in a grammar $G_Z = G' \dot\cap \ldots \dot\cap G'$ and a
variable $Z' = V_{Z^1} \dot\cap \ldots \dot\cap V_{Z^k}$ such that $\bigcap_i [\![V_{Z^i}]\!]_{G'(\Phi)} \subseteq [\![Z']\!]_{G_Z(\Phi)}$.

3. If $\xi(t, G, Y)$ does not exist or some $Z'$ is nullable in $G_Z$ then return $t^{-X}((Y, G)) =
(V, \emptyset)$ as the result (because $t^{-Z}((Y, G)) = \emptyset$, for any $Z$).

4. Otherwise return $t^{-X}((Y, G)) = (X', G_X)$.

From the last proposition and the appropriate property of the grammar intersection
operation it follows that if the algorithm produces $t^{-X}((Y, G)) = (V, H)$ then

$$t^{-X}([\![Y]\!]_{G(\Phi)}) \subseteq [\![V]\!]_{H(\Phi)}.$$



4.4.5 Inclusion checking for parametric sets

The algorithms for checking inclusion of the sets defined by discriminative term
grammars can be generalized to extended parametric grammars.

The problem is stated as follows. Let $G_1$ and $G_2$ be PED grammars. Let $X$
be a variable in $G_1$ and let $Y$ be a variable in $G_2$. We want to check whether
$[\![X]\!]_{G_1(\Phi)} \subseteq [\![Y]\!]_{G_2(\Phi)}$ for any valuation $\Phi$ such that $G_1(\Phi)$, $G_2(\Phi)$ are parameterless.
We will denote this fact by $(X, G_1) \sqsubseteq (Y, G_2)$ (often abbreviated to $X \sqsubseteq Y$).

We begin with introducing some notions. By $C(X, Y)$ we mean the least set of
pairs (of variables or parameters) such that

• $(X, Y) \in C(X, Y)$ and

• if $(X', Y') \in C(X, Y)$, $X' \to f(X_1, \ldots, X_n) \in G_1$ and $Y' \to f(Y_1, \ldots, Y_n) \in G_2$
then $(X_1, Y_1), \ldots, (X_n, Y_n) \in C(X, Y)$.

An algorithm checking whether $X \sqsubseteq Y$ follows immediately from the following
property and from finiteness of $C(X, Y)$.

Proposition 4.13

Let $G_1, G_2$ be PED grammars and $X, Y$ be variables of, respectively, $G_1, G_2$. Assume
that for each pair $(V, W) \in C(X, Y)$

• if $V$ is a parameter then $V = W$ or rule $W \to \top$ is in $G_2$,

• if $V$ is a variable then

- for each rule $V \to f(V_1, \ldots, V_n) \in G_1$ ($n \ge 1$) there exists a rule $W \to
f(\ldots) \in G_2$ or $W \to \top \in G_2$, and

- for each rule $V \to c \in G_1$, where $c$ is a constant or base symbol, there
exists a $W \to c' \in G_2$ such that $[\![c]\!] \subseteq [\![c']\!]$.

Then $X \sqsubseteq Y$.

The reverse implication holds provided $G_1$ does not have nullable symbols, $[\![c]\!] \neq \emptyset$
for each base symbol $c$, and if $[\![c]\!] \subseteq [\![W]\!]_{G_2(\Phi)}$ for some base symbol or constant
$c$ and variable $W$ of $G_2$ then $G_2$ contains a rule $W \to c'$ where $[\![c]\!] \subseteq [\![c']\!]$. Intuitively,
the last condition means that no set $[\![c]\!]$ is described by $G_2$ by more than one rule.
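The sufficient condition of the proposition translates into a worklist computation of $C(X, Y)$ followed by a rule-by-rule test. The sketch below is an illustration only, under assumptions not made in the paper: grammars are Python dictionaries, constants are rules with an empty argument tuple, base symbols are strings, and inclusion between constants or base symbols is simplified to "equal, or the right-hand one is $\top$" (the hypothetical helper `base_subset`).

```python
TOP = "TOP"  # stands for the base symbol "top"

def pairs(g1, g2, x, y):
    """Least set C(x, y): pairs reachable through rules with the same functor."""
    todo, seen = [(x, y)], set()
    while todo:
        v, w = todo.pop()
        if (v, w) in seen:
            continue
        seen.add((v, w))
        for r1 in g1.get(v, []):
            for r2 in g2.get(w, []):
                if (isinstance(r1, tuple) and isinstance(r2, tuple)
                        and r1[0] == r2[0] and len(r1[1]) == len(r2[1])):
                    todo.extend(zip(r1[1], r2[1]))
    return seen

def base_subset(c, d):
    """Assumed ordering on constants and base symbols: equal, or d is TOP."""
    return c == d or d == TOP

def included(g1, g2, x, y, params):
    """Check the conditions of the proposition for every pair of C(x, y)."""
    for v, w in pairs(g1, g2, x, y):
        w_rules = g2.get(w, [])
        if v in params:                      # parameter: same one, or W -> TOP
            if v != w and TOP not in w_rules:
                return False
            continue
        for r in g1.get(v, []):
            if isinstance(r, tuple) and r[1]:            # V -> f(V1..Vn), n >= 1
                ok = TOP in w_rules or any(
                    isinstance(s, tuple) and s[0] == r[0] and len(s[1]) == len(r[1])
                    for s in w_rules)
            else:                                        # constant or base symbol
                ok = any(base_subset(r, s) for s in w_rules)
            if not ok:
                return False
    return True


# Two list grammars over the same parameter alpha.
G1 = {"Y": [("cons", ("alpha", "Z"))],
      "Z": [("nil", ()), ("cons", ("alpha", "Y"))]}
G2 = {"X": [("nil", ()), ("cons", ("alpha", "X"))]}
```

Since the condition is only sufficient in general, a `False` answer means "not shown to be included" unless the extra assumptions of the reverse implication hold.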

Proof

Assume that the conditions are satisfied. For any $(V, W) \in C(X, Y)$ and any derivation
$V \Rightarrow^*_{G_1} t$, where $t$ is a variable-free term, there exists a derivation $W \Rightarrow^*_{G_2} u$
such that $[\![t]\!]_{G_1(\Phi)} \subseteq [\![u]\!]_{G_2(\Phi)}$ for any $\Phi$. This can be shown by induction on the
structure of $t$. If a constrained term $w$ is in $[\![V]\!]_{G_1(\Phi)}$ then $w \in [\![t]\!]_{G_1(\Phi)}$ for some $t$
as above. Hence $w \in [\![W]\!]_{G_2(\Phi)}$, which completes the "if" part of the proof.

Assume that the conditions are not satisfied, for some pair $(V, W) \in C(X, Y)$.
We show that for some parameter valuation $\Phi$ there exists a constrained term $t$
such that $t \in [\![V]\!]_{G_1(\Phi)}$ and $t \notin [\![W]\!]_{G_2(\Phi)}$. We enumerate the possible cases; in each
of them such $\Phi$ and $t$ obviously exist.

If $V$ is a parameter then $W$ is a different parameter or a variable such that
$[\![W]\!]_{G_2(\Phi)} \neq [\![\top]\!]$. For $V$ being a variable we have two cases: $V \to f(\ldots) \in G_1$ and
no $W \to f(\ldots)$ is in $G_2$, or $V \to c \in G_1$ and for each $W \to c' \in G_2$, $[\![c]\!] \not\subseteq [\![c']\!]$, hence
$[\![c]\!] \cap [\![c']\!] = \emptyset$ (by our restrictions on base sets).

Now it is easy to construct a $u \in [\![X]\!]_{G_1(\Phi)}$ such that $u \notin [\![Y]\!]_{G_2(\Phi)}$ by induction
on the definition of $C(X, Y)$ (on the number of applications of the second rule of
the definition of $C(X, Y)$ needed to show that $(V, W) \in C(X, Y)$). □

We illustrate the check by a simple example. 

Example 4.14

$$\begin{array}{llll}
G_1: & Y \to cons(\alpha, Z) \qquad & G_2: & X \to nil \\
 & Z \to nil & & X \to cons(\alpha, X) \\
 & Z \to cons(\alpha, Y) & &
\end{array}$$

We want to check the inclusion

$$[\![Y]\!]_{G_1(\Phi)} \subseteq [\![X]\!]_{G_2(\Phi)}$$

for arbitrary parameter valuation $\Phi$ such that $G_1(\Phi)$ and $G_2(\Phi)$ are parameterless.
For each pair of $C(Y, X)$ the conditions from the proposition are to be checked.
$C(Y, X)$ contains $(Y, X), (\alpha, \alpha), (Z, X)$.

Consider $(Y, X)$. For the rule $Y \to cons(\alpha, Z) \in G_1$ there exists $X \to cons(\alpha, X) \in
G_2$. For $(\alpha, \alpha)$ the check is immediate. For $(Z, X)$, the following pairs of rules are
found to satisfy the conditions.

$$\begin{array}{ll}
Z \to nil \in G_1, & X \to nil \in G_2 \\
Z \to cons(\alpha, Y) \in G_1, & X \to cons(\alpha, X) \in G_2
\end{array}$$

So the check is successfully completed.



Set matching 

In our approach, a set of allowed calls of a polymorphic procedure will be specified
by a set descriptor $(Y, G)$ where $G$ is a PED grammar. A particular call $t$ is allowed
if there exists a valuation of parameters $\Phi$ such that $t \in [\![Y]\!]_{G(\Phi)}$.

A set of actual calls may be described by another set descriptor $(X, H)$, where
$H$ is a PED grammar which has no parameter common with $G$.

We want to be sure that all actual calls are allowed. As the specifications are
parametric we have to refer to their instances. The question is then, whether for
any valuation $\Psi$ of the parameters of $H$ there exists a parameter valuation $\Phi$ for
$G$ such that $[\![X]\!]_{H(\Psi)} \subseteq [\![Y]\!]_{G(\Phi)}$. Additionally we are interested in obtaining a
possibly small set $[\![Y]\!]_{G(\Phi)}$. We will call this a set matching problem.

A solution can be obtained by a modification of the set inclusion algorithm discussed
above. In this extension the parameters of $H$ are handled as constants while
searching for such bindings of the parameters of $G$ that the inclusion holds.

For a given $X, H$ and $Y, G$ the matching algorithm constructs a parameter valuation
$\Phi$ (possibly containing parameters from $H$) such that for any $\Psi$ for which
$H(\Psi)$ is parameterless

$$[\![X]\!]_{H(\Psi)} \subseteq [\![Y]\!]_{G(\Phi)(\Psi)}.$$

(This is expressed as $(X, H) \sqsubseteq (Y, G(\Phi))$ in the notation of the previous section).
To describe matching we recall how the inclusion algorithm works. Applied to
$X$ in $H$ and $Y$ in $G$, it checks the conditions of Proposition 4.13 for each pair
$(s, t) \in C(X, Y)$. The difference with the matching algorithm is in the treatment of
a pair $(s, t)$ where $t$ is a parameter (of $G$). In such a case the inclusion algorithm answers
"no". In matching we want to instantiate the parameters of $G$ so that inclusion
holds. So in this case the matching algorithm binds the parameter $t$ to $s$ (which
is a variable or a parameter). Notice that several different bindings for $t$ may be
produced since $t$ may appear in several pairs in $C(X, Y)$.
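The modification of the inclusion check described above can be sketched as follows. This is a hedged illustration, not the paper's implementation: the dictionary encoding of grammars, the function name `match`, and the simplification that a base symbol of $H$ is covered only by the same base symbol or by $\top$ in $G$ are all assumptions.

```python
TOP = "TOP"  # stands for the base symbol "top"

def match(h, g, x, y, h_params, g_params):
    """Traverse C(x, y); return bindings (parameter of g, symbol of h) or None."""
    todo, seen, bindings = [(x, y)], set(), []
    while todo:
        s, t = todo.pop()
        if (s, t) in seen:
            continue
        seen.add((s, t))
        if t in g_params:                    # bind instead of answering "no"
            bindings.append((t, s))
            continue
        t_rules = g.get(t, [])
        has_top = TOP in t_rules
        if s in h_params:                    # parameters of h act as constants
            if not has_top:
                return None
            continue
        for r in h.get(s, []):
            if isinstance(r, tuple):         # f(...) or a constant f()
                partner = next((q for q in t_rules if isinstance(q, tuple)
                                and q[0] == r[0] and len(q[1]) == len(r[1])), None)
                if partner is not None:
                    todo.extend(zip(r[1], partner[1]))
                elif not has_top:
                    return None
            elif r not in t_rules and not has_top:   # base symbol
                return None
    return bindings


# Lists of triples: H is the actual-call grammar, G the specification.
H = {"L": [("nil", ()), ("cons", ("T", "L"))],
     "T": [("t", ("B", "N", "gamma"))],
     "B": [("tt", ()), ("ff", ())],
     "N": ["nat"]}
G = {"S": [("nil", ()), ("cons", ("E", "S"))],
     "E": [("t", ("alpha", "alpha", "beta"))]}
found = match(H, G, "L", "S", {"gamma"}, {"alpha", "beta"})
```

Here `found` contains two bindings for `alpha` (to `B` and to `N`) and one for `beta`; turning such bindings into a valuation is a separate step.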

As $C(X, Y)$ is finite, the checking terminates with failure or success. In the latter
case a set of bindings is produced. From these bindings we now construct a
parameter valuation $\Phi$. This is done separately for each parameter $\alpha$. Let $\{\alpha \mapsto
s_1, \ldots, \alpha \mapsto s_k\}$ ($k \ge 1$) be the set of bindings for $\alpha$ produced by the algorithm.
The valuation $\Phi(\alpha)$ is constructed by considering the following cases:

• If $k = 1$ then $\Phi(\alpha) = (s_1, H)$.

• If $k > 1$ and all $s_i$ are variables of $H$, then $\Phi(\alpha) = (s_1 \dot\cup \ldots \dot\cup s_k, H \dot\cup \ldots \dot\cup H)$.

• Otherwise $k > 1$ and some $s_i$ is a parameter. Then $\Phi(\alpha) = (X, \{X \to \top\})$
where $X$ is a new variable.

Let $\alpha_1, \ldots, \alpha_n$ be all the parameters of $G$ that appear in $C(X, Y)$. Applying the
above stated rules to each of them we obtain $\Phi = \{\, \alpha_1 \mapsto \Phi(\alpha_1), \ldots, \alpha_n \mapsto \Phi(\alpha_n) \,\}$.
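The three cases can be sketched directly. The code below is a symbolic illustration under assumptions of our own: bindings are given as `(parameter, symbol)` pairs, set descriptors are returned as pairs of strings, `U` stands for the union operation on variables and grammars, and `X_new` is a hypothetical name for the fresh variable.

```python
from collections import defaultdict

def valuation(bindings, h_params):
    """Turn bindings (parameter, symbol of H) into a symbolic valuation."""
    grouped = defaultdict(list)
    for param, sym in bindings:
        if sym not in grouped[param]:        # the bindings form a set
            grouped[param].append(sym)
    phi = {}
    for param, syms in grouped.items():
        if len(syms) == 1:                               # k = 1
            phi[param] = (syms[0], "H")
        elif all(s not in h_params for s in syms):       # k > 1, all variables
            phi[param] = (" U ".join(syms), " U ".join(["H"] * len(syms)))
        else:                                            # some s_i is a parameter
            phi[param] = ("X_new", "{X_new -> TOP}")
    return phi


phi = valuation([("alpha", "B"), ("alpha", "N"), ("beta", "gamma")], {"gamma"})
```

A real implementation would call the grammar union construction instead of building strings; the strings only make the case analysis visible.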

This completes the description of the matching algorithm. It remains to show
that if it succeeds then $[\![X]\!]_{H(\Psi)} \subseteq [\![Y]\!]_{G(\Phi)(\Psi)}$ for any parameter valuation $\Psi$.

Assume that $\Phi(\alpha_i) = (X_i, G_i)$ (for $i = 1, \ldots, n$) and that $X_i$ was renamed into $X_i'$
while constructing $G(\Phi)$. We apply the inclusion checking algorithm to $X, H$ and
$Y, G(\Phi)$ and compare its actions with those of the matching algorithm for $X, H$ and
$Y, G$.

Whenever the matching algorithm produces a pair $(s, t)$ of two variables, the
same pair is produced by the inclusion checking algorithm. Whenever the former
produces an $(s, \alpha_i)$ then the second produces $(s, X_i')$. Grammar $G_i$ has been constructed
in such a way that $[\![s]\!]_{H(\Psi)} \subseteq [\![X_i]\!]_{G_i(\Psi)}$. As $[\![X_i]\!]_{G_i(\Psi)} = [\![X_i']\!]_{G(\Phi)(\Psi)}$ we
have $[\![s]\!]_{H(\Psi)} \subseteq [\![X_i']\!]_{G(\Phi)(\Psi)}$. Hence for each pair $(s, t)$ produced by the inclusion
checking algorithm, $[\![s]\!]_{H(\Psi)} \subseteq [\![t]\!]_{G(\Phi)(\Psi)}$. This completes the proof.

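Example 4.15

This example illustrates set matching. The parametric grammars $H$ and $G$ specify
different variants of lists with elements being triples.

$$\begin{array}{llll}
H: & L \to nil \qquad & G: & S \to nil \\
 & L \to cons(T, L) & & S \to cons(E, S) \\
 & T \to t(B, N, \gamma) & & E \to t(\alpha, \alpha, \beta) \\
 & B \to tt & & \\
 & B \to ff & & \\
 & N \to nat & &
\end{array}$$

We want to match $(L, H)$ and $(S, G)$. We obtain $C(L, S) =
\{(L, S), (T, E), (B, \alpha), (N, \alpha), (\gamma, \beta)\}$. The checks succeed with parameter bindings

$$\{\alpha \mapsto B,\ \alpha \mapsto N,\ \beta \mapsto \gamma\}.$$

The result is the parameter valuation

$$\Phi = \{\, \alpha \mapsto (B \dot\cup N, H \dot\cup H),\ \beta \mapsto (\gamma, H) \,\}$$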



5 Locating Program Errors with Parametric Specifications 

The call-success semantics discussed earlier describes a program (together
with its set of initial goals) by the set of calls and the set of successes. So the
information about which successes correspond to which calls is lost. A more precise
semantics can be given by replacing the set of successes by the set of pairs of a call
and a corresponding success.

A formalism of distributive grammars does not provide useful approximations of
such semantics. If pairs $(call_1, success_1)$, $(call_2, success_2)$ are in such an approximation
then $(call_1, success_2)$, $(call_2, success_1)$ are there too. Useful approximations can,
however, be provided by parametric distributive grammars. With such a grammar one
can specify a family of specifications. Correctness w.r.t. such a family means the
following. Whenever a call is correct w.r.t. some specification from the family then
any of its successes is correct w.r.t. this specification. Additionally, each call is correct
w.r.t. some of the specifications.

In this section we address the question of partial correctness of programs w.r.t. 
parametric specifications. First we state formally the problem and show that it can 
be re-formulated in terms of parametric set constraints. We show how to employ 
the constraints to check whether a program is correct w.r.t. a given specification 
and how to compute a specification for which the program is correct. Then we 
formalize the notion of error and discuss how the correctness checking procedure 
locates errors. 



5.1 Parametric specifications and program correctness 

By a parametric specification we mean a set of specifications.* We are interested
in specifications given by parametric grammars; this is, however, insignificant for
the purposes of this section. Here we define the notion of correctness for such
specifications and prove a sufficient condition for such correctness.

Definition 5.1

Let $Spec$ be a parametric specification. A call $c \mathbin{[]} A$ in an LD-derivation is correct
w.r.t. $Spec$ if there exists some $(Pre, Post) \in Spec$ such that $c \mathbin{[]} A \in Pre$. A success
$c' \mathbin{[]} A\theta$ corresponding to a call $c \mathbin{[]} A$ is correct w.r.t. $Spec$ if $c' \mathbin{[]} A\theta \in Post$, for any
$(Pre, Post) \in Spec$ such that $c \mathbin{[]} A \in Pre$.

A program $P$ with a set of initial goals $\mathcal{Q}$ is correct w.r.t. $Spec$ iff in any LD-derivation
of $P$ starting from a goal from $\mathcal{Q}$ all the calls and successes are correct
w.r.t. $Spec$. A program $P$ is correct w.r.t. $Spec$ iff $P$ with the set of initial goals
$\bigcup \{\, Pre \mid (Pre, Post) \in Spec \,\}$ is correct w.r.t. $Spec$.
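The two notions of the definition differ in their quantifiers: a call needs some member of the family, while a success must satisfy every member whose precondition admits the call. The toy sketch below makes this explicit. It is only an illustration: atoms are plain strings, the sets are tiny finite samples (so they are not really closed under instantiation), and the predicate `id/2` and both function names are hypothetical.

```python
def call_correct(spec, call):
    """A call is correct if SOME (Pre, Post) in the family has the call in Pre."""
    return any(call in pre for pre, _post in spec)

def success_correct(spec, call, success):
    """A success is correct if it is in Post for EVERY pair whose Pre has the call."""
    return all(success in post for pre, post in spec if call in pre)


# A toy family for a hypothetical predicate id/2, one member per "type".
spec = [
    ({"id(1,Y)", "id(1,1)"}, {"id(1,1)"}),          # numbers
    ({"id(tt,Y)", "id(tt,tt)"}, {"id(tt,tt)"}),     # booleans
]
```

With this family the success `id(tt,tt)` is rejected for the call `id(1,Y)`, which is exactly the call-to-success correspondence that a single non-parametric specification would lose.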

We impose the following restrictions on parametric specifications. If $(Pre, Post)$ is
a member of such a specification then $Pre$, $Post$ are closed under instantiation
and $Pre \supseteq Post$.† The correctness criterion from Proposition 3.1 can now be
generalized.

Theorem 5.2

Let $P$ be a CLP program, $\mathcal{Q}$ a set of atomic initial goals and $Spec$ be a parametric
specification. Let each $(Pre, Post) \in Spec$ respect constraints. A sufficient condition
for $P$ with $\mathcal{Q}$ being correct w.r.t. $Spec$ is:

1. For each clause $H \leftarrow B_1, \ldots, B_n$ and any $(Pre_0, Post_0) \in Spec$ there exist
$(Pre_1, Post_1), \ldots, (Pre_n, Post_n) \in Spec$ such that for $j = 0, \ldots, n$, any
substitution $\theta$ and constraint $c$

if $c \mathbin{[]} H\theta \in Pre_0$, $c \mathbin{[]} B_1\theta \in Post_1$, $\ldots$, $c \mathbin{[]} B_j\theta \in Post_j$
then

$$\begin{array}{ll}
c \mathbin{[]} B_{j+1}\theta \in Pre_{j+1}, & \text{if } j < n \\
c \mathbin{[]} H\theta \in Post_0, & \text{if } j = n
\end{array}$$

2. Each element of $\mathcal{Q}$ is in some $Pre$, such that $(Pre, Post) \in Spec$.

(As explained in Section 3.1, the restriction to atomic initial goals is not substantial).

* Remember that a (non parametric) specification is a pair of sets of (constrained) atoms.
† The latter condition is not essential. To abandon it, it is sufficient to replace each $Post_i$ in
Theorem 5.2 by $Pre_i \cap Post_i$.

Proof

Consider the $i$-th goal $Q_i$ of an LD-derivation starting from a goal $Q_0 \in \mathcal{Q}$. We
show that the call and the successes occurring in $Q_i$ are correct. The proof is by
induction on $i$. If $i = 0$ then $Q_i$ contains no successes and the call in $Q_i$ is obviously
correct.

Let $i > 0$. Consider the call in $Q_i$. (The case of the goal containing no call is
considered later on). $Q_i$ is of the form $c \mathbin{[]} (B_{j+1}, \ldots, B_n, \vec{A})\tau$, where $j < n$, for some
clause $H \leftarrow B_1, \ldots, B_n$ of $P$, and the derivation is

$$\begin{array}{lcl}
Q_{i_0} & = & c_0 \mathbin{[]} A, \vec{A} \\
Q_{i_1} & = & c_0 \mathbin{[]} (B_1, \ldots, B_n, \vec{A})\theta_0 \\
Q_{i_2} & = & c_1 \mathbin{[]} (B_2, \ldots, B_n, \vec{A})\theta_0\theta_1 \\
 & \vdots & \\
Q_{i_{j+1}} & = & c_j \mathbin{[]} (B_{j+1}, \ldots, B_n, \vec{A})\theta_0 \cdots \theta_j
\end{array}$$

where $i_{j+1} = i$, $\theta_0 \cdots \theta_j = \tau$ and the call $c_{l-1} \mathbin{[]} B_l\theta_0 \cdots \theta_{l-1}$ from a goal $Q_{i_l}$ succeeds
in the goal $Q_{i_{l+1}}$ (for $l = 1, \ldots, j$). The calls from $Q_{i_0}, \ldots, Q_{i_j}$ are correct, by the
inductive assumption. So there exist $(Pre_0, Post_0), \ldots, (Pre_j, Post_j) \in Spec$ such
that $c_0 \mathbin{[]} A \in Pre_0$ and $c_{l-1} \mathbin{[]} B_l\theta_0 \cdots \theta_{l-1} \in Pre_l$ for $l = 1, \ldots, j$.

Now we show that the successes of these calls are correct. This means
$c_l \mathbin{[]} B_l\theta_0 \cdots \theta_l \in Post_l$ for $l = 1, \ldots, j$ and for any $(Pre_0, Post_0), \ldots, (Pre_j, Post_j)$
as above. Notice that this includes the $(Pre_1, Post_1), \ldots, (Pre_j, Post_j)$ from condition 1
of the Theorem.

The successes from $Q_{i_1}, \ldots, Q_{i_j}$ are correct by the inductive assumption. Also
the success from $Q_{i_{j+1}}$ of $c_{j-1} \mathbin{[]} B_j\theta_0 \cdots \theta_{j-1}$ is correct. To show this remove (the
instances of) $B_{j+1}, \ldots, B_n, \vec{A}$ from the goals of the derivation $Q_{i_j}, \ldots, Q_{i_{j+1}}$,
obtaining a derivation to which the inductive assumption applies. (The derivation is
shorter than $i$ and starts from an atomic goal). Other procedure calls (from goals
between $Q_{i_j}$ and $Q_{i_{j+1}}$) may succeed in $Q_{i_{j+1}}$. These successes are correct by the
same reasoning.

As all $Pre_l$, $Post_l$ are instance closed, we have $c_j \mathbin{[]} A\tau \in Pre_0$ and $c_j \mathbin{[]} B_l\tau \in
Post_l$ for $l = 1, \ldots, j$. Moreover, $A\tau = H\tau$, as $A\theta_0 = H\theta_0$. From condition 1 of the
Theorem it follows that the call $c_j \mathbin{[]} B_{j+1}\tau$ is correct.

It remains to consider the case when $Q_i$ does not contain a call. So $Q_i$ is of the
form $c \mathbin{[]}$ and the initial goal $Q_0$ succeeds in $Q_i$. Let $Q_0 = c_0 \mathbin{[]} A$. If $A$ is a constraint
then $i = 1$ and $Q_1 = c_0, A \mathbin{[]}$. As the specification respects constraints, the success in
$Q_1$ is in $Post$ whenever $Q_0 \in Pre$ and $(Pre, Post) \in Spec$. If $A$ is not a constraint
then we have a derivation as above, with $j = n \ge 0$ (so $Q_i = Q_{i_{n+1}}$), $Q_{i_0}$ being the
initial goal (so $i_0 = 0$) and $\vec{A}$ being empty. Reasoning as previously we obtain that
the premises of the implication in the Theorem hold. Hence $c_n \mathbin{[]} A\tau = c_n \mathbin{[]} H\tau \in
Post_0$. As the choice of $Pre_0$ was arbitrary, this holds for any $(Pre_0, Post_0) \in Spec$
such that $c_0 \mathbin{[]} A \in Pre_0$. So the success of $c_0 \mathbin{[]} A$ is correct. □

In our approach parametric specifications are given by parametric grammars. We
assume that such a grammar $G$ has two distinguished variables $Call$, $Success$. The
specification is then

$$Spec = \{\, ([\![Call]\!]_{G(\Phi)}, [\![Success]\!]_{G(\Phi)}) \mid G(\Phi) \text{ is parameterless} \,\}.$$

We require that each specification $(Pre, Post) \in Spec$ respects constraints. Additionally
we require that for each $p$ such that $Success \to p(\vec{Y}) \in G$, each parameter
occurring in $(p(\vec{Y}))_G$ occurs also in $(p(\vec{X}))_G$, where $Call \to p(\vec{X}) \in G$. Informally,
this means that procedure successes may only depend on those parameters on which
the corresponding procedure calls depend. This assures that to each $Pre$ there
corresponds exactly one $Post$ such that $(Pre, Post) \in Spec$.

Each grammar providing a specification can be seen as consisting of two parts. 
One is fixed for a given programming language and specifies the semantics of con- 
straint predicates. The second is given by the user and describes the predicates 
defined by her program. Real CLP languages have built-in predicates, they can be 
treated by our method like constraint predicates. 



5.2 Correctness checking 



In this section we discuss checking the verification conditions of Theorem 5.2 with
respect to a parametric specification given by a PED grammar. We generalize to
such specifications the ideas of Section 3.2.

Similarly as in the parameterless case, each implication from Theorem 5.2 can
be expressed by a system $F_j(C)$ of constraints consisting of

$$X > H^{-X}(Call_0) \cap \bigcap_{l=1}^{j} B_l^{-X}(Success_l) \qquad (4)$$

(where $C = H \leftarrow B_1, \ldots, B_n$ is the considered clause, $0 \le j \le n$ and $k$ ranges over
the occurrences of $X$ in the considered atom) for each variable $X$ occurring in $C$,
and of the constraint

$$\begin{array}{ll}
Call_{j+1} > B_{j+1} & \text{if } j < n, \\
Success_0 > H & \text{if } j = n.
\end{array} \qquad (5)$$

So for the condition 1 from the Theorem to hold it is sufficient that for each
choice of $(Pre_0, Post_0) \in Spec$ there exist $(Pre_1, Post_1), \ldots, (Pre_n, Post_n) \in Spec$
such that each constraint system $F_j(C)$ ($j = 0, \ldots, n$) has a model $I$ in which
$I(Call_i) = Pre_i$, $I(Success_i) = Post_i$, for $i = 0, \ldots, n$.
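To make the shape of the systems $F_j(C)$ concrete, the sketch below writes them out as strings for a given clause. It is an illustration only, with assumptions of our own: atoms are pairs of a predicate name and a list of variable names, `t^-X(V)` is an ASCII rendering of a generalized projection, `&` stands for intersection, and the function names are hypothetical.

```python
def atom_str(atom):
    """Render an atom (name, [variables]) as name(V1,...,Vn)."""
    name, args = atom
    return f"{name}({','.join(args)})"

def clause_vars(head, body):
    """Variables of the clause in order of first occurrence."""
    seen = []
    for _name, args in [head] + body:
        for v in args:
            if v not in seen:
                seen.append(v)
    return seen

def constraint_systems(head, body):
    """Return [F_0, ..., F_n], each a list of constraints written as strings."""
    n = len(body)
    systems = []
    for j in range(n + 1):
        system = []
        for x in clause_vars(head, body):
            # projection of the head on Call_0, then of B_1..B_j on Success_l
            parts = [f"{atom_str(head)}^-{x}(Call_0)"]
            parts += [f"{atom_str(body[l])}^-{x}(Success_{l + 1})" for l in range(j)]
            system.append(f"{x} > " + " & ".join(parts))
        if j < n:
            system.append(f"Call_{j + 1} > {atom_str(body[j])}")
        else:
            system.append(f"Success_0 > {atom_str(head)}")
        systems.append(system)
    return systems


# The clause s(L,S) :- p(L,S), sd(S).
F = constraint_systems(("s", ["L", "S"]), [("p", ["L", "S"]), ("sd", ["S"])])
```

For a clause with $n$ body atoms this yields $n + 1$ systems, one per implication of the theorem; solving them still requires the projection, intersection and matching operations.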

Now assume that the specification is given by a parametric grammar $G$. A particular
$(Pre_0, Post_0)$ is given by a parameterless instance $G(\Phi)$ of $G$ for some parameter
valuation $\Phi$: $Pre_0 = [\![Call]\!]_{G(\Phi)}$, $Post_0 = [\![Success]\!]_{G(\Phi)}$. For any such $\Phi$ we are
looking for $\Phi_1, \ldots, \Phi_n$ describing, respectively, $(Pre_1, Post_1), \ldots, (Pre_n, Post_n)$.
As the latter depend on $\Phi$, the grammars of $\Phi_1, \ldots, \Phi_n$ may be parametric, with
the parameters originating from grammar $(Call)_G$. $\Phi_1, \ldots, \Phi_n$ should be chosen in
such a way that for any $\Phi$, each $F_j(C)$ has a model $I$ in which

$$\begin{array}{ll}
I(Call_0) = [\![Call]\!]_{G(\Phi)}, & I(Success_0) = [\![Success]\!]_{G(\Phi)}, \\
I(Call_i) = [\![Call]\!]_{G(\Phi_i)(\Phi)}, & I(Success_i) = [\![Success]\!]_{G(\Phi_i)(\Phi)}
\end{array} \qquad (6)$$

for $i = 1, \ldots, n$. This can be done in the following way.

Assume that $\Phi_1, \ldots, \Phi_j$ ($0 \le j \le n$) have already been found. We show how to
check the $j$-th implication of Theorem 5.2 and, if $j < n$, how to construct $\Phi_{j+1}$.
Let $G_0, \ldots, G_j$ be the grammars $G, G(\Phi_1), \ldots, G(\Phi_j)$ with the variables renamed
apart such that

1. $Call$, $Success$ in $G(\Phi_i)$ are renamed into, respectively, $Call_i$, $Success_i$, for $i =
1, \ldots, j$, and $Call$, $Success$ in $G$ into $Call_0$, $Success_0$,

2. no variable occurs in more than one grammar $G_0, \ldots, G_j$ and no variable from
clause $C$ occurs in $G_0, \ldots, G_j$.

Now $F_j(C) \cup G_0 \cup \ldots \cup G_j$ is to be converted into a discriminative grammar. For
each variable $X$ in the clause, constraint (4) is transformed as described in Section 3.2
by applying generalized projection and intersection operations from Section 4.4.



First for each $A^{-X}(Y)$ occurring in (4) by generalized projection we obtain
$(X_A, G_A)$ such that $A^{-X}([\![Y]\!]_{(G_0 \cup \ldots \cup G_j)(\Phi)}) \subseteq [\![X_A]\!]_{G_A(\Phi)}$. (Notice that
$Y$ is $Call_i$ or $Success_i$, thus $[\![Y]\!]_{(G_0 \cup \ldots \cup G_j)(\Phi)} = [\![Y]\!]_{G_i(\Phi)}$.) Then the intersection
operation (followed by appropriate variable renaming) is applied to
$(X_H, G_H), (X_{B_1}, G_{B_1}), \ldots, (X_{B_j}, G_{B_j})$, resulting in $(X, G_X)$ such that

$$[\![X]\!]_{G_X(\Phi)} \supseteq H^{-X}([\![Call_0]\!]_{(G_0 \cup \ldots \cup G_j)(\Phi)}) \cap \bigcap_{l=1}^{j} B_l^{-X}([\![Success_l]\!]_{(G_0 \cup \ldots \cup G_j)(\Phi)}).$$

In this way we construct $G_X$ for each variable $X$ of $C$. A renaming is applied
so that the variables of the constructed grammars $G_X$ are distinct and
$Call_1, \ldots, Call_n, Success_1, \ldots, Success_n$ do not occur in any $G_X$. Let $G' = \bigcup_X G_X$.
Notice that $G'$ is discriminative and that, for any $\Phi$, the least model of $(G' \cup G_0 \cup
\ldots \cup G_j)(\Phi)$ is a model of $\mathcal{C} = F_j(C) - \{(5)\} \cup (G_0 \cup \ldots \cup G_j)(\Phi)$.

Also, the constraint (5) is converted into a discriminative grammar $G''$ in an
obvious way, as described in Section 3.2. Each model of $G''$ is a model of (5); each
model of (5) coincides with some model of $G''$ on the variables of (5).

Take an arbitrary $\Phi$ (such that $(G_0 \cup \ldots \cup G_j)(\Phi)$ is parameterless). Let $I_\Phi$ be the
least model of $\mathcal{C} = F_j(C) - \{(5)\} \cup (G_0 \cup \ldots \cup G_j)(\Phi)$. We have $I_\Phi(X) \subseteq [\![X]\!]_{G'(\Phi)}$ for
any variable $X$ occurring in $C$, and $I_\Phi(Y) = [\![Y]\!]_{G_i(\Phi)}$ for $Y$ being $Call_i$ or $Success_i$,
$i = 1, \ldots, j$. Let us represent (5) as $Y > A$, where $Y$ is $Call_{j+1}$ or $Success_0$ and $A$ is,
respectively, $B_{j+1}$ or $H$. It holds that $[\![Y]\!]_{G'(\Phi) \cup G''} = [\![A]\!]_{G'(\Phi) \cup \{(5)\}} = [\![A]\!]_{G'(\Phi)} \supseteq
I_\Phi(A)$.

If $j = n$ then $Y$ is $Success_0$, $A$ is $H$ and it remains to apply the inclusion
algorithm to check whether

$$[\![Success_0]\!]_{G'(\Phi) \cup G''} \subseteq [\![Success]\!]_{G(\Phi)},$$

for any $\Phi$. If yes then $I_\Phi$ is a model of (5) (as $[\![Success]\!]_{G(\Phi)} = I_\Phi(Success_0)$), hence
a model of $F_j(C)$. It has the required properties, as (6) holds for $i = 1, \ldots, n$.

If $j < n$ then $Y$ is $Call_{j+1}$, $A$ is $B_{j+1}$ and $\Phi_{j+1}$ has to be constructed. We have
$I_\Phi(B_{j+1}) \subseteq [\![Call_{j+1}]\!]_{G'(\Phi) \cup G''}$, for any $\Phi$. Now we apply the set matching operation
of Section 4.4 to obtain $\Phi_{j+1}$ such that for any $\Phi$

$$[\![Call_{j+1}]\!]_{G'(\Phi) \cup G''} \subseteq [\![Call]\!]_{G(\Phi_{j+1})(\Phi)}.$$

Take an interpretation $I'_\Phi$ such that $I'_\Phi(Call_{j+1}) = [\![Call]\!]_{G(\Phi_{j+1})(\Phi)}$,
$I'_\Phi(Success_{j+1}) = [\![Success]\!]_{G(\Phi_{j+1})(\Phi)}$ and $I'_\Phi(V) = I_\Phi(V)$ for any other variable
$V$. For any $\Phi$, $I'_\Phi$ is a model of (5) (as $I_\Phi(B_{j+1}) \subseteq I'_\Phi(Call_{j+1})$) and hence of
$F_j(C) \cup (G_0 \cup \ldots \cup G_j)(\Phi)$. It also fulfills the requirements (6) for $i = 1, \ldots, j+1$. If
the set matching fails, then the program is not found to be correct.

Computing $\Phi_{j+1}$ (or, in the case of $j = n$, performing the inclusion check) completes
the iteration step for $j$. The reasoning above provides a proof for:

Lemma 5.3

If the process described above succeeds producing $\Phi_1, \ldots, \Phi_n$ then the condition
1 from Theorem 5.2 is satisfied, for clause $C$ and the parametric specification given
by the parametric grammar $G$.

If the clause does not satisfy the condition of Theorem 5.2 then the process
of checking is bound to fail. The reverse is not true. The correctness checking of
a correct program may fail, due to the fact that the employed intersection and
projection operations for parametric grammars are approximate.

Due to similarity of this correctness checking algorithm to that described in
Section 3.2, we expect that its complexity is the same.



Example 5.4

Consider the following clause, a part of the "Slowsort" program:

    slowsort(L,S) :- perm(L,S), sorted(S).

For this clause we have the following three systems of constraints (we abbreviate
slowsort as s, perm as p and sorted as sd):

$$\begin{array}{ll}
F_0: & L > s(L,S)^{-L}(Call_0) \\
 & S > s(L,S)^{-S}(Call_0) \\
 & Call_1 > p(L,S) \\[4pt]
F_1: & L > s(L,S)^{-L}(Call_0) \cap p(L,S)^{-L}(Success_1) \\
 & S > s(L,S)^{-S}(Call_0) \cap p(L,S)^{-S}(Success_1) \\
 & Call_2 > sd(S) \\[4pt]
F_2: & L > s(L,S)^{-L}(Call_0) \cap p(L,S)^{-L}(Success_1) \cap sd(S)^{-L}(Success_2) \\
 & S > s(L,S)^{-S}(Call_0) \cap p(L,S)^{-S}(Success_1) \cap sd(S)^{-S}(Success_2) \\
 & Success_0 > s(L,S)
\end{array}$$

A specification is provided by the following parametric grammar $G$:

$$\begin{array}{ll}
Call \to s(ListN, Any) \qquad & Success \to s(ListN, ListN) \\
Call \to p(List, Any) & Success \to p(List, List) \\
Call \to sd(ListN) & Success \to sd(ListN) \\
ListN \to [\,] & List \to [\,] \\
ListN \to [Nat|ListN] & List \to [\alpha|List] \\
Nat \to nat & Any \to \top
\end{array}$$

The first step of checking the correctness of the clause w.r.t. the specification
deals with $F_0$. First one uses the generalized projection operation to compute
$s(L,S)^{-L}((Call_0, G)) = (ListN, G)$ and $s(L,S)^{-S}((Call_0, G)) = (Any, G)$. We
may informally say that the first two rules of $F_0$ have been transformed into
$L > ListN$, $S > Any$.

Then $G_L$ and $G_S$ are respectively $(ListN)_G$ and $(Any)_G$ (the subsets of $G$ defining
$ListN$ and $Any$), with the variables appropriately renamed. Their union is

$$\begin{array}{ll}
G': & L \to [\,] \\
 & L \to [Nat'|L] \\
 & S \to \top \\
 & Nat' \to nat
\end{array}$$

The grammar $G''$ is just the last rule of $F_0$. Matching $(Call_1, G' \cup G'') \sqsubseteq (Call, G)$
succeeds after checking the pairs $(Call_1, Call)$, $(L, List)$, $(S, Any)$, $(Nat', \alpha)$. The
result is $\Phi_1 = \{\alpha \mapsto (Nat', G' \cup G'')\}$. So the first implication of the verification
condition is satisfied, provided that $(Call_1, Success_1)$ is defined by $G(\Phi_1)$ (after an
appropriate variable renaming).

We briefly outline the remaining two steps. Notice that the results of generalized
projections from one step are also used in later steps.

Dealing with $F_1$ begins with computing two new generalized projections:
$p(L,S)^{-L}((Success_1, G_1)) = (List_1, G_1)$ and $p(L,S)^{-S}((Success_1, G_1)) =
(List_1, G_1)$, where $G_1$ is a renamed $G(\Phi_1)$ and $List_1$ is the renamed $List$. (We
may informally say that the first two rules of $F_1$ have been transformed into
$L > ListN \cap List_1$, $S > Any \cap List_1$.)

Then the intersection operation is applied to approximate the sets $[\![ListN]\!]_{G(\Phi)} \cap
[\![List_1]\!]_{G_1(\Phi)}$ and $[\![Any]\!]_{G(\Phi)} \cap [\![List_1]\!]_{G_1(\Phi)}$ by grammars $(ListN \dot\cap List_1)_{G \dot\cap G_1}$ and
$(Any \dot\cap List_1)_{G \dot\cap G_1}$. The grammars are renamed, so that $ListN \dot\cap List_1$ becomes $L$ and
$Any \dot\cap List_1$ becomes $S$, resulting in $G'$. Matching $(Call_2, G' \cup \{Call_2 \to sd(S)\}) \sqsubseteq
(Call, G)$ does not involve any parameter and succeeds, so $\Phi_2 = \emptyset$ and $G_2$ is $G$ with
variables renamed.

Similarly, in the third step the projections related to atom $sd(S)$ result in
$(Any_2, G_2)$ and $(ListN_2, G_2)$. (We may informally say that the first two rules of $F_2$
have been transformed into $L > ListN \cap List_1 \cap Any_2$, $S > Any \cap List_1 \cap ListN_2$.)
Notice that $[\![List_1]\!]_{G_1} = [\![ListN_2]\!]_{G_2}$. $G'$ obtained in this step is essentially the same
as that in the previous one: the sets $[\![L]\!]$ and $[\![S]\!]$ that $G'$ defines are the same as
in the previous step. The inclusion check succeeds, which completes checking that
the clause is correct.



5.3 Computing parametric specifications

Now we show how to compute a parametric specification approximating the seman- 
tics of a given program. 

Consider a parametric specification $Spec$. Notice that if the verification conditions
of Proposition 3.1 hold for each (non parametric) specification from $Spec$ then the
conditions of Theorem 5.2 hold, with $(Pre_0, Post_0) = \ldots = (Pre_n, Post_n)$. Thus
the program is correct w.r.t. $Spec$. We will use this fact in constructing parametric
specifications for a given program. The initial goals are described by a parametric
grammar $G_0$. $G_0$ also describes the constraint predicates, similarly as in Section 3.3.
We are going to construct a parametric grammar $G$ (with the parameters from $G_0$)
such that whenever the initial call is from $[\![Call]\!]_{G_0(\Phi)}$, all the calls and successes
are from $[\![Call]\!]_{G(\Phi)}$, $[\![Success]\!]_{G(\Phi)}$, respectively.



To compute $G$ we proceed as in the parameterless case (Section 3.3). The only
difference is that the algorithm is now applied to parametric grammars. We require
that the description of constraint predicates is parameterless. So whenever a rule
$Call \to p(Y_1, \ldots, Y_n)$ or $Success \to p(Y_1, \ldots, Y_n)$, where $p$ is a constraint predicate,
appears in $G_0$ then $(Y_i)_{G_0}$ does not contain any parameters (for $i = 1, \ldots, n$).
Obviously, we require that the specification given by $G_0(\Phi)$ respects constraints.



We employ the verification conditions of Proposition 3.1 expressed as the constraint
system $C(P)$ (see Section 3.1). For the grammar $G_0$ as above, $C(P)$ is parametric:
$C(P) = C \cup G_0$, where $C$ is a set of parameterless constraints

$$\bigcup_{c \in P} \bigcup_{j} \mathcal{F}_j(c).$$

Consider a parameterless instance $G_0(\Phi)$ of $G_0$. If $I$ is a model of $C(P)(\Phi)$ such that
$Spec = (I(Call), I(Success))$ respects constraints then the verification conditions
of Proposition 3.1 are satisfied, as shown in Section 3.1.

Our goal is to construct a grammar $G$ such that for any $\Phi$ (for which $G(\Phi)$ is
parameterless) there exists a model $I$ of $C(P)(\Phi)$ in which $I(Call) = \llbracket Call \rrbracket_{G(\Phi)}$ and
$I(Success) = \llbracket Success \rrbracket_{G(\Phi)}$. This implies that the verification conditions of
Proposition 3.1 are satisfied for each specification $(\llbracket Call \rrbracket_{G(\Phi)}, \llbracket Success \rrbracket_{G(\Phi)})$.
Hence the verification conditions of Theorem 5.2 are satisfied for the parametric
specification

$$\{\, (\llbracket Call \rrbracket_{G(\Phi)}, \llbracket Success \rrbracket_{G(\Phi)}) \mid G(\Phi) \text{ is parameterless} \,\}$$

given by grammar $G$, and the program is correct w.r.t. this specification.

To obtain such a grammar we use the iterative procedure of Section 3.3. It starts
with $G_0$ and produces a sequence of grammars $G_i$. Any parameter appearing in $G_i$
occurs in $G_0$. The description of the constraint predicates in any $G_i$ is the same as
in $G_0$. The constructed grammars $G_i$ have the following property, for any $\Phi$ (such
that $G_0(\Phi)$ is parameterless): the constraints $C$ are satisfied if the occurrences of
$Call$ and $Success$ in constraints (1) (see Section 3.1) are valuated as in the least
model of $G_i(\Phi)$ and the occurrences of $Call$ and $Success$ in constraints (2) as in the
least model of $G_{i+1}(\Phi)$. This follows from the discussion in Sections 3.2 and 3.3, which
can be repeated for the case of parametric grammars. The difference is that in the
parameter-free case the operations of intersection and projection are exact while in
the parametric case they are approximate. However, the conclusions hold in both
cases. In particular, if $G'$, $G''$ are constructed as in Section 3.2 then the least model
of (G U G")(*) is a model of J^- i(c) U G($) and the least model of (G' U G")($) is 
a model of ^},i(c) (for ^ assigning parameterless grammars to the parameters 
of $G$).

As discussed in Section 3^ it is necessary to apply some technique for enforcing 
termination while computing fixpoints. As discussed there, our prototype implementation
uses for that purpose an adaptation of a technique of (Mildner, 1999), which
also extends to the parametric case.

Now $G_i$ is the required grammar. For any $\Phi$ as above there exists a model $J$ of
$C$ which coincides with the least model of $G_i(\Phi)$ on $Call$ and $Success$. An interpretation
$I$ in which the variables of $C$ are valuated as in $J$ and the variables of
$G_0$, except for $Call$ and $Success$, as in the least model of $G_0(\Phi)$, is the required model
of $C(P)(\Phi)$. As explained above, if such a model exists then the program is correct
w.r.t. the parametric specification given by $G_i$.

We derive a somewhat restricted kind of parametric specifications. Whenever the
initial goal is in $\llbracket Call \rrbracket_{G(\Phi)}$, all the calls and successes of the computation are,
respectively, in $\llbracket Call \rrbracket_{G(\Phi)}$ and $\llbracket Success \rrbracket_{G(\Phi)}$. Thus our approach is unable to construct
parametric specifications in which various uses of a predicate in a program are
described by different instances of the parametric specification.



5.4 Error detection 

The purpose of error diagnosis is to locate the errors in the program. By errors we 
mean those program fragments that are the reasons that the program is incorrect 
w.r.t. a given specification. For the semantics chosen in this work, the incorrectness 
means that some call or success in some computation of the program violates the 
specification. Such calls or successes will be called error symptoms. A pragmatic 
requirement is that the errors found are as small program fragments as possible. 

In traditional approaches, debugging begins with symptoms, obtained from executing
the program on some test data. Obviously, only a finite subset of the (usually)
infinite set of test data can be used. In our approach symptoms are not needed. At
the expense of restricting the class of specifications to types defined by parametric
discriminative grammars, program correctness can be checked automatically. A
successful check is a proof that the program is correct. Equivalently, if the program
is incorrect then the check fails; moreover, from the correctness checking algorithm
we can obtain information locating the errors.



Our correctness checking algorithm uses the sufficient condition of Theorem 5.2.
The condition consists of $n + 1$ implications for each $n$-ary clause of the program,
i.e. a clause with $n$ body atoms (and an obvious condition on the initial atomic goals).
Each implication concerns a prefix $H \leftarrow B_1, \ldots, B_i$ of a clause
$H \leftarrow B_1, \ldots, B_n$ ($1 \le i \le n$).[12] Two implications
concern the whole clause ($i = n$). If the program is incorrect then some of the
implications do not hold. The clause prefixes corresponding to these implications
will be considered the errors of the program.

[12] In the notation of Theorem 5.2, $i = j + 1$ if $j < n$ and $i = n$ if $j = n$.

Definition 5.5

Let $P$ be a program and $Spec$ a parametric specification. An error in $P$ (w.r.t. $Spec$)
is a prefix $H \leftarrow B_1, \ldots, B_{k+1}$ ($0 \le k \le n-1$) of a clause $H \leftarrow B_1, \ldots, B_n$ of $P$, or
the whole clause $H \leftarrow B_1, \ldots, B_n$ (then $k = n$) such that for some $(Pre_0, Post_0) \in Spec$
and for each $(Pre_1, Post_1), \ldots, (Pre_k, Post_k) \in Spec$ such that the implication
of Theorem 5.2 holds[13] for $j = 0, \ldots, k-1$, there exists a substitution $\theta$ and
constraint $c$ such that $c \,[]\, H\theta \in Pre_0$, $c \,[]\, B_1\theta \in Post_1$, \ldots, $c \,[]\, B_k\theta \in Post_k$ and

    $c \,[]\, B_{k+1}\theta \notin Pre_{k+1}$ for any $(Pre_{k+1}, Post_{k+1}) \in Spec$, if $k < n$,
    $c \,[]\, H\theta \notin Post_0$, if $k = n$.

We say that the representative of the error is $B_{k+1}$ when $0 \le k \le n-1$, or $H$
when $k = n$. (So it is the atom whose instance is found incompatible with the
specification.)

This definition formalizes the intuition of a program fragment being the reason 
of incorrectness. Such fragments have to be changed in order to obtain a correct 
program. On the other hand, in a general case there are no semantic criteria to 
state what in such a fragment has to be changed. In this sense the errors defined 
above are minimal. What is "the error" from the pragmatic point of view depends
on the programmer's intentions about the exact intended semantics of the program.

Example 5.6 

Consider a type specification

    $Call > m(Any, L)$          $L > [\,]$              $Any > \top$
    $Success > m(\alpha, L)$    $L > [\alpha|L]$

and a clause m(X,[Y,Z]) :- m(X,Z). The (prefix being the) whole clause
is incorrect w.r.t. the specification, as for $j = 0$ the second argument of the call
$m(X,Z)\theta$ is, speaking informally, of type $\alpha$ instead of $L$. We cannot state which
atom of the clause is erroneous. To obtain a correct clause one may for instance
replace m(X,[Y,Z]) by m(X,[Y|Z]), or m(X,Z) by m(X,[Z]). Only knowing that
m is intended to define a list membership relation makes it possible to decide what
is the actual error (w.r.t. the (exact) intended semantics of the program).



Notice that there is at most one error in a given clause, as Definition 5.5 requires
that the implications for $j = 0, \ldots, k-1$ hold. Thus according to our definition each
proper prefix of an error is not an error. The reason is that if $H \leftarrow B_1, \ldots, B_{j+1}$,
$0 \le j < k$, were an error then we would not have a criterion which $(Pre_{j+1}, Post_{j+1})$
to consider in determining that $H \leftarrow B_1, \ldots, B_{k+1}$ is an error.[14]

We will use the correctness checking procedure from the previous section to locate
errors in programs. If a clause contains an error then the procedure will fail. The
reverse is not true: correctness checking of a clause not containing an error may fail,
due to approximation inaccuracies of the intersection and projection operations.

[13] This means that for any substitution $\theta$ and constraint $c$,
if $c \,[]\, H\theta \in Pre_0$, $c \,[]\, B_1\theta \in Post_1$, \ldots, $c \,[]\, B_j\theta \in Post_j$
then $c \,[]\, B_{j+1}\theta \in Pre_{j+1}$.

[14] Such a criterion may be obtained by setting $Post_{j+1} = \bigcup \{ Post \mid (Pre, Post) \in Spec \}$.

The correctness checking procedure finds each clause containing an error. Moreover,
to a certain extent a clause prefix containing the error is located. If $\Phi_1, \ldots, \Phi_j$
are successfully constructed then each prefix $H \leftarrow B_1, \ldots, B_i$, for $i = 1, \ldots, j$, is
not an error. If constructing $\Phi_{j+1}$ then fails, it is possible that some prefix
$H \leftarrow B_1, \ldots, B_i$, where $i > j$, is an error. If no approximation inaccuracies had
appeared then $H \leftarrow B_1, \ldots, B_{j+1}$ would have been an error. The inaccuracies make
it possible that some larger prefix is an error or the clause does not contain an error.
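
The way a failed check is turned into a reported prefix and its representative can be
sketched as follows. This is only an illustration written in Python (the tool itself is
implemented in Prolog); the function name locate_error, the string encoding of atoms and
the callback check are assumptions of this sketch, and the actual checks are inclusion
tests on grammars rather than the hand-written outcomes used here. Because of the
approximation inaccuracies discussed above, the reported prefix is a candidate error
rather than a guaranteed one.

```python
def locate_error(head, body, check):
    """Return ((head, prefix), representative) for the first failed implication.

    head  -- the clause head H (any printable object)
    body  -- the list of body atoms [B1, ..., Bn]
    check -- hypothetical callback: check(j) is True when the j-th implication
             (j = 0, ..., n) has been verified
    Returns None when all n + 1 implications are verified.
    """
    n = len(body)
    for j in range(n + 1):
        if check(j):
            continue
        if j < n:
            # The call of B_{j+1} may violate the specification:
            # the prefix is H <- B1, ..., B_{j+1}, represented by B_{j+1}.
            return (head, body[: j + 1]), body[j]
        # j = n: the success of the head may violate the specification:
        # the whole clause is reported, represented by H.
        return (head, body), head
    return None


# The clause of Example 5.6, with the outcome for j = 0 written by hand.
result = locate_error("m(X,[Y,Z])", ["m(X,Z)"], lambda j: j != 0)
print(result)
```

The loop stops at the first failure, which mirrors the observation that prefixes checked
successfully before that point are not errors.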



6 The prototype diagnosis tool 

6.1 The structure of the tool

We implemented a prototype tool that locates errors by checking correctness of a
program w.r.t. types specified by PED grammars. Notice that such a grammar may
or may not include parameters. As already mentioned in the Introduction, the tool
consists of three main components:

• the type inferencer - for a given program and parametric entry declaration
  constructs parametric directional types of the program using the technique of
  Section 5.3. The types approximate the program semantics.

• the type checker - checks correctness of a program w.r.t. given parametric
  directional types using the technique of Section 5.2.

• the specification editor - a GUI which makes it possible to specify intended
  directional types and also to inspect and to re-use in this specification the
  inferred types.

A diagnosis session starts with type inference. The inferencer may issue some
warnings about illegal calls to built-in predicates. This happens if the inferred call
type for a built-in is not a subtype of the expected one. The expected call types for
built-ins are stored in the system library and may be viewed as a part of the specification
given a priori.

The main part of the session consists in providing/editing by the user a specifi- 
cation of the intended types. The type checker works interactively with the editor. 
Each verification condition is checked as soon as a sufficient fragment of a spec- 
ification is provided. The diagnosis relies entirely on the provided types. It does 
not involve execution of the program and it does not use the inferred types. The 
role of type inference is auxiliary. As mentioned above, the inferencer may discover 
certain irregularities in the program and its warnings suggest starting points for 
the diagnosis. On the other hand, the inferred types may be used as a draft for the 
specification; this simplifies the task of constructing the specification by the user. 



[14, continued] We expect, however, that the definition modified in such a way would
define errors which do not correspond to an intuitive notion of an error.

The current version of the tool supports a substantial subset of the CHIP language.
It can be easily modified to be used with any Prolog-like language. The
prototype has been implemented in SICStus Prolog. A more detailed description of
our tool, in its version for parameterless specifications, together with an example
error diagnosis session is given in (Drabent et al., 2000a).



6.2 Types 

The parametric specifications used by the tool are PED-grammars defined in Section 4.
For every parameter valuation such a grammar defines a set of constrained
terms. A parametric type defined by such a grammar can be seen as a family of
sets (of constrained terms).

In the implementation we use the notation as shown in the example below. We
write

:-typedef tree --> nil; t(elem,tree,tree)

to denote the grammar

    $Tree > nil$
    $Tree > t(Elem, Tree, Tree)$

Such a grammar may be a part of a program.
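
As a rough illustration of what such a grammar denotes, the sketch below checks whether
a ground term belongs to the type tree. It is written in Python purely for illustration
(the tool is implemented in Prolog), and everything in it is an assumption of the sketch
rather than part of the tool: the dictionary GRAMMAR, the function member, the encoding
of terms as (functor, arguments) pairs, and in particular the treatment of elem, which
the text leaves undefined and which is taken here to stand for nat.

```python
# Ground terms are encoded as (functor, [arguments]); natural numbers as Python ints.
GRAMMAR = {
    # :-typedef tree --> nil; t(elem,tree,tree)
    "tree": [("nil", []), ("t", ["elem", "tree", "tree"])],
}


def member(term, type_name, grammar=GRAMMAR):
    """Return True if the ground term belongs to the named type."""
    if type_name == "any":
        return True
    if type_name == "nat":
        return isinstance(term, int) and not isinstance(term, bool) and term >= 0
    if type_name == "elem":
        # Assumption of this sketch: elem is read as nat.
        return member(term, "nat", grammar)
    if not isinstance(term, tuple):
        return False
    functor, args = term
    # In a discriminative grammar a type has at most one rule per function
    # symbol, so the first rule with a matching functor decides the answer.
    for rule_functor, arg_types in grammar.get(type_name, []):
        if rule_functor == functor and len(arg_types) == len(args):
            return all(member(a, t, grammar) for a, t in zip(args, arg_types))
    return False


nil = ("nil", [])
print(member(("t", [3, nil, ("t", [7, nil, nil])]), "tree"))
print(member(("t", [3, nil]), "tree"))
```

The first call concerns a well-formed tree and the second a term with a missing subtree,
so only the first is accepted.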

The present version of the tool uses four base types:

• any denotes $\llbracket \top \rrbracket$,

• nat denotes the set of natural numbers,

• anyfd denotes the set of constrained atoms of the form $x \in FD \,[]\, x$, where $FD$
  is a finite domain, i.e. a finite set of natural numbers,[15]

• int denotes the set of integers.

The approach to base types in the implementation does not satisfy Requirement 
^.13. Namely, the sets denoted by anyfd and int are neither disjoint nor does one of them
include the other. This design choice remained from the previous versions of our
approach. It is dealt with by some ad hoc modifications of the grammar operations.
It will be changed, by adding a base type neg of negative numbers and defining the
set of integers as the union of $\llbracket nat \rrbracket$ and $\llbracket neg \rrbracket$.

The type of a top call for a program is provided by an entry declaration, for
instance:

:- entry delete(list(A),A,any).

Parameters are identifiers starting with a capital letter (like variables in Prolog). Thus
the above declaration says that we intend to delete an element of an arbitrary type A
(the second argument) from the list of elements of that type (the first argument).
The third argument is supposed to be a variable on call, which can be only expressed
as any.

To make the system interface more user-friendly we introduced a library of type
definitions which may be augmented by the user. It contains, for instance, a parametric
grammar defining the type list(A), i.e. lists of elements of type A.

[15] We do not distinguish between $c$ and $x \in \{c\} \,[]\, x$.

Whenever possible, the types computed by the system are presented to the user
in terms of those defined in the library or declared by the user. In this way the user
faces familiar and meaningful type names instead of artificial ones. For instance,
assume that the system has to display a type t77 together with the grammar rule
t77 --> [] ; [t78|t77]. Then it finds that they are an instance of the rules defining
list(A) and displays list(t78) instead.
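
The renaming step in this example can be sketched as below. The sketch is in Python for
illustration only and handles just the library type list(A); the names as_list_instance
and display_name, and the encoding of a rule as a (functor, argument types) pair with
"." standing for the list constructor, are assumptions of this sketch and not the tool's
actual data structures.

```python
def as_list_instance(name, rules):
    """If `rules` (the rules for type `name`) are an instance of
    list(A) --> [] ; [A|list(A)], return the type bound to A, else None."""
    by_functor = {functor: args for functor, args in rules}
    if len(rules) != 2 or set(by_functor) != {"[]", "."}:
        return None
    if by_functor["[]"] != []:
        return None
    cons_args = by_functor["."]
    # The tail of the list constructor must refer back to the type itself.
    if len(cons_args) != 2 or cons_args[1] != name:
        return None
    return cons_args[0]


def display_name(name, grammar):
    """Prefer the familiar library name list(...) over an artificial one."""
    element = as_list_instance(name, grammar.get(name, []))
    return f"list({element})" if element is not None else name


# t77 --> [] ; [t78|t77]
inferred = {"t77": [("[]", []), (".", ["t78", "t77"])]}
print(display_name("t77", inferred))
print(display_name("t78", inferred))
```

A type without matching rules, such as t78 here, keeps its artificial name.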

When providing the specification the user gives intended call and success types
for a given program. Formally this means providing grammar rules for Call and
Success. So the grammar providing the specification consists of the rules kept in
the library, the grammar rules given in :-typedef declarations of the program and
the rules for Call and Success provided by the user during the diagnosis session.



6.3 Inferring and checking types 



The type inference algorithm is based on the description of Sections 3.3 and 5.3.
It computes an approximation of the call-success semantics of a given program. This is
done by means of fixed point iteration. The algorithm is implemented in Prolog.

For all programs used in our experiments (up to 230 clauses and 52 predicates)
the prototype implementation computes approximations in reasonable time.[16]

As already mentioned in Section 3.3, in the parameterless case the algorithm can
be seen as a method of solving set constraints. However, the solution obtained is in
general not the least one because of widening and of the approximate nature of
the union operation which is used by the algorithm. Extension to the parametric
case introduces additional loss of information caused by the operations on PED
grammars discussed in Section 4.4.



The type inferencer is not able to find polymorphic dependencies between vari- 
ables by itself. The only parameters that may appear during the analysis are those 
provided by the user in the entry declaration. 



As discussed in Section 4.4, the definitions of operations on PED grammars include
some arbitrary decisions. The union and the intersection of a type parameter
with another type are, respectively, $\llbracket \top \rrbracket$ and the other type. The implementation
produces a warning whenever these situations appear during type inference.

The rationale behind the warnings is as follows. The type parameter in a call
specification reflects the intuition that any instance of the parametric type is allowed
at call. Normally it means that the analyzed procedure is polymorphic and it is
supposed to work for any instance of the parameter. Thus the result of the analysis
should be independent of potential instantiations of the parameter. In other words,
none of the operations on types should touch parameters. If it happens then the
procedure may not work as a polymorphic one.
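
The two decisions and the accompanying warnings can be pictured with the following
sketch. It is written in Python for illustration only; the function names, the encoding
of types as strings or tuples, and the use of Python's warnings module are assumptions
of this sketch. The case in which a parameter meets itself is also an assumption, as the
text above does not describe it.

```python
import warnings

TOP = "any"  # plays the role of the type of all constrained terms


def is_parameter(t):
    """Parameters are identifiers starting with a capital letter, e.g. "A"."""
    return isinstance(t, str) and t[:1].isupper()


def union_with_parameter(param, other):
    """Upper bound of a type parameter and another type."""
    assert is_parameter(param)
    if other == param:
        return param  # assumption of this sketch: A joined with A stays A
    warnings.warn(f"union of parameter {param} with {other!r} approximated by {TOP}")
    return TOP


def intersection_with_parameter(param, other):
    """Intersection of a type parameter and another type."""
    assert is_parameter(param)
    if other == param:
        return param  # assumption of this sketch
    warnings.warn(f"intersection of parameter {param} with {other!r} taken to be {other!r}")
    return other


print(union_with_parameter("A", ("list", "A")))
print(intersection_with_parameter("A", "nat"))
```

Each warning marks a point where the analysis result depends on the parameter, which is
exactly the situation the rationale above describes as suspicious for a polymorphic
procedure.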

The type inference algorithm constructs call and success types of the predicates
defined by program clauses, thus computing an approximation of their call-success
semantics. To be able to deal with real programs, it uses a library of type specifications
of built-in predicates. Similarly it is able to deal with fragments of programs
(for instance with programs under development). In the latter case the user is required
to provide type descriptions for the undefined predicates.

[16] 21.88 s in the worst case, running SICStus Prolog, ver. 3.8.4 on Sun-Ultra 10/440, with 440
MHz CPU speed and 265 MB RAM.

As already mentioned, the diagnosis relies on the type specification provided 
incrementally by the user. The specification process is supported by the possibility 
to accept some types constructed in the analysis phase as specified ones. This 
possibility is restricted to the types of the predicates relevant for the diagnosed 
predicate. Moreover, a heuristic is used to suggest to the user the order of specifying
types. Following this order often results in fewer type specifications needed to locate 
an error. The user may stop the diagnosis with the first error message, which is 
often obtained without specifying all requested types. The diagnosis process may 
be continued by specifying all requested types. In this case, the tool will locate all 
incorrect clause prefixes in the fragment of the program relevant for the diagnosed 
predicate. 

An error message contains an incorrect clause. The incorrect prefix is indicated 
by referring to its representative (cf. Definition 5.5). The specification provided 
by the user is stored by the diagnoser and may be re-used during further diagnosis 
sessions. 



6.4 Examples 

Below we show some examples illustrating the use of the diagnosis tool. The exam- 
ples exhibit an advantage of parametric analysis over the non-parametric one. 
Consider the following erroneous program: 

append([],Ys,Ys).
append([H|Xs],Ys,[H,Zs]) :-
    append(Xs,Ys,Zs).

The head of the second clause should be append([H|Xs],Ys,[H|Zs]). Assume that
the append/3 predicate is supposed to concatenate two lists of an arbitrary type.
In the non-parametric framework the best way to express such a type is list(any).
After analyzing the program with the following entry point declaration:

:-entry append(list(any),list(any),any).

the inferred success type is

append(list(any),list(any),list(any))

The reason for inferring such a (success) type for the third argument of append/3
is that the type of the two-element list originating from the head of the second clause
([H,Zs]) has been joined, by means of the upper bound operation, with the type
list(any) coming from the recursive call of append/3. It results in the type
list(any). Thus nothing suspicious can be concluded.

On the other hand, if we provide a parametric declaration:

:-entry append(list(A),list(A),any).

then the inferred success type does not meet our expectations:

append(list(A),list(A),list(any))

as we would rather wish to have list(A) as a result. Moreover, the analyzer warns
us that the parameter A (originating from the success of the first clause and the type
list(A) of Ys) will be approximated by any while computing an upper bound
with the type list(A) (originating from the second clause and the term [H,Zs],
in which Zs is of type list(A)).

After the user has specified the success type, the diagnoser locates the error and
reports it by indicating its representative append([H|Xs],Ys,[H,Zs]).

The next example is a fragment of a job scheduling program. The fragment sets 
up precedence constraints among the jobs. A job is described by a term job(T,P) , 
where T is a starting time of processing the job and P is its duration. As T has to 
be found by the program it is a domain variable; P is fixed. The jobs are kept in a 
list and are identified by the position in it. 

The precedence between two jobs is represented as a term prec(J1,J2), with the
meaning: J2 cannot start before J1 has been completed. All such pairs are kept in
a list. The precedence constraints are set up by the procedure precedences/2
defined below.

:-typedef tprec --> prec(nat,nat).
:-typedef tjob --> job(anyfd,nat).

:-entry precedences(list(tprec),list(tjob)).

precedences([],_).
precedences([prec(A,B)|Ps],Jobs) :-
    get_nth(Jobs,A,job(TA,PA)),
    get_nth(Jobs,B,job(TB,_)),
    TB #>= TA + PA,
    precedences(Ps,Jobs).

get_nth([_|X],1,X) :- !.        % bug here
get_nth([_|Xs],N,X) :-
    N1 is N - 1,
    get_nth(Xs,N1,X).

The :-typedef declarations define new types used in the entry declaration. The
first clause defining get_nth/3 contains a bug, as the first argument of its head
should be [X|_].

The inferred success type for precedences/2 is:

precedences(t52,list(tjob))

together with a definition of t52:

t52 --> []

This means that the procedure may succeed only when the precedence list is empty.
If a diagnosis session is started with this predicate the user is asked to provide
expected call and success types for get_nth/3. Assume they are respectively:

get_nth(list(A),int,any)

and

get_nth(list(A),int,A)

After this step the diagnoser presents as an error the clause prefix pointed to by the
representative

get_nth([_|X],1,X).

The reason for the error message is that the inclusion check of list(A) and A fails.
Notice, however, that in the non-parametric framework the specification for get_nth/3
could be get_nth(list(any),int,any), both for calls and successes. In this case
the inclusion check of list(any) and any would succeed, and the bug would not
be discovered by the diagnosis.
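
The contrast between the two inclusion checks can be reproduced with a deliberately
simplified sketch. It is written in Python for illustration only and is not the
grammar-based inclusion test used by the tool: types are encoded as strings (base types
and parameters) or tuples such as ("list", "A"), the function name included is an
assumption of the sketch, base types are related only by equality, and a parameter is
treated as an unknown set that is included only in itself and in any.

```python
def is_parameter(t):
    """Parameters are identifiers starting with a capital letter, e.g. "A"."""
    return isinstance(t, str) and t[:1].isupper()


def included(sub, sup):
    """Sound but incomplete syntactic check that type `sub` is included in `sup`."""
    if sup == "any":
        return True          # every type is included in any
    if sub == sup:
        return True
    if is_parameter(sub) or is_parameter(sup):
        return False         # nothing is known about the instances of a parameter
    if isinstance(sub, tuple) and isinstance(sup, tuple):
        # Same type constructor: compare the arguments pairwise.
        return (sub[0] == sup[0] and len(sub) == len(sup)
                and all(included(a, b) for a, b in zip(sub[1:], sup[1:])))
    return False


# Third argument of get_nth([_|X],1,X): X has the type of the list tail.
print(included(("list", "A"), "A"))      # parametric specification
print(included(("list", "any"), "any"))  # non-parametric specification
```

With the parametric specification the check is rejected, which is what exposes the bug;
with any in place of the parameter the same check is accepted.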



7 Discussion and Conclusions 



7.1 Related Work 



This work is directly related to:

• the research on proving partial correctness of logic programs w.r.t. call-success
  specifications,

• the research on approximating semantics of logic programs by descriptive
  types based on set constraints and on abstract interpretation.

It extends some of the techniques proposed in these fields to handle parametric 
polymorphism and constraint domains. 



Partial correctness. From (Bronsard et al., 1992; Apt, 1993; Bossi & Cocco,
1989) and our own previous work (Drabent & Maluszynski, 1988; Boye &
Maluszynski, 1997) we extend to CLP a directional view of logic programs in the
sense that each predicate is considered a procedure which, when applied to a suitable
tuple of call arguments, returns upon a success a tuple of computed values.
This is formalized by the notion of call-success semantics.



We rely on the proof methods of (Drabent & Maluszynski, 1988; Bossi & Cocco,
1989) for proving partial correctness of logic programs w.r.t. call-success specification.
We use their modification for CLP described in (Drabent et al., 2000b; Drabent
et al., 2000a) and we extend them to deal with parametric specifications. For
specifications formulated as definite set constraints (Heintze & Jaffar, 1990a)[17]
correctness can be effectively checked by reformulation of the verification conditions
of the above mentioned methods, also as definite set constraints. As discussed
in Section 3.1 such a reformulation requires a specific operation called generalized
projection, which is a special case of the "quantified set expression" of (Heintze
& Jaffar, 1994) and the "membership expression" of (Devienne et al., 1997b; Talbot
et al., 2000). For the reasons discussed in Section 3.1.1 we choose as our specification
language a parametric variant of the well-known formalism of discriminative
regular term grammars[18], see e.g. (Dart & Zobel, 1992), additionally equipped with
basic types for handling constrained terms and atoms of CLP. The same language
is used for describing approximations of call-success semantics. Traditionally such
approximations are called descriptive types of logic programs.

[17] Later studied also by (Charatonik & Podelski, 1997) and (Talbot et al., 2001).

Soundness of our method of type checking is stated by Lemma 3^ , which gives a 
sufficient condition for correctness of CLP programs for specifications given as term
grammars. This result then extends to PED grammars. A recent paper (Comini
et al., 2000) argues that such sufficient conditions for verification of logic programs
can be systematically derived if the considered class of specifications is defined as an
abstract interpretation domain with Galois connection relating them to a concrete
semantics of logic programs. Unfortunately, as shown in (Drabent & Pietrzak, 1998),
for our non-parametric specifications such a Galois connection does not exist,[19] so
it is not clear whether the method is applicable.

Types in logic programming. We follow the descriptive typing approach where
types approximate a posteriori the semantics of untyped programs. The early work
on descriptive types (Mishra, 1984; Janssens & Bruynooghe, 1992; Frühwirth et al.,
1991; Yardeni & Shapiro, 1991) was based on the least model semantics. The problems
considered were how to check that the least model semantics is included in
a regular set of terms (the type checking problem) and how to approximate it by
regular sets (the type inference problem). The regular sets were defined by regular
grammars or equivalently by regular unary logic programs (Frühwirth et al., 1991).
This approach does not take into account the intended use of the predicates and
therefore gives few possibilities for finding typing errors. The focus is mostly on
detecting that for some predicates the inferred types are empty sets, in which case
the predicates never succeed.



Checking of directional types based on set constraints was discussed in (Aiken &
Lakshman, 1994). The types used are sets of non-ground terms. They are specified
by set constraints together with a lifting function Sat that maps a set of ground
terms to a set of non-ground terms. Type checking is based on the same verification
condition we use, which in general form originates from (Drabent & Maluszynski,
1988; Bossi & Cocco, 1989) and was specifically formulated for directional type
checking in (Apt, 1993). We also allow non-ground types but in contrast to this
work we achieve non-groundness not by lifting ground sets but by extending set
constraints with constants interpreted as basic non-ground types.

Inference of directional types in the framework of set constraints was illustrated
by an example in (Heintze, 1992). (The main topic of that paper is implementation
techniques for solving set constraints.) In the example the types are inferred
by constructing set constraints analogous to our encoding of verification conditions,
and solving them. A more recent work on inference of directional types for logic
programs is (Charatonik & Podelski, 1998). It rephrases (as Theorem 1) the verification
conditions of (Apt, 1992) in a model-theoretic setting.[20] Thus, a starting
point for type inference are again the verification conditions used both in (Aiken
& Lakshman, 1994) and in our work. In (Charatonik & Podelski, 1998) the directional
types are regular, but in general not discriminative. They are characterized
by the least model of a uniform program constructed from the original program.
The authors are not specific about the algorithms to be used for constructing a
representation of the resulting directional types. In contrast to this work we do not
construct uniform programs. We encode the verification conditions as set expressions.
Directional types are models of these expressions. We restricted our attention
to discriminative directional types. This made it possible to extend the type checking
and type inference algorithms of (Gallagher & de Waal, 1994; Mildner, 1999),
based on abstract interpretation, to the case of parametric directional types.

[18] Such grammars define sets acceptable by deterministic root-to-frontier tree automata.
Alternatively, the sets are called tuple-distributive or path-closed.

[19] The abstraction function does not exist, as there does not exist the best approximation of a
given set of terms by a regular set of terms. This holds for both kinds of regular sets, those
defined by discriminative and by arbitrary term grammars.

Our work follows the idea of using semantic approximations for program verification
and for locating errors presented in (Bueno et al., 1997). This idea was also
used for designing a generic preprocessor for validation and debugging of CLP programs
(Puebla et al., 2000). The preprocessor verifies various assertions, provided
by the user or inferred, in particular also non-parametric discriminative directional
types similar to ours.

While most of the papers on types in logic programming claim error detection
as their objective, little attention is usually devoted to locating errors. In this
paper we extend our previous approach to locating errors (Drabent et al., 2000b;
Drabent et al., 2000a) to the case of polymorphic types. As discussed in Section 6.4,
this gives some more opportunities to locate the reasons for discrepancies between
the actual program and user expectations.

Parametric Polymorphism in Logic Programming. Use of parametric polymorphic
types in logic programming was first suggested in (Mycroft & O'Keefe,
1984). In this approach the function symbols and the predicates of a logic program
are supposed to have a priori declared types. The types are used to restrict the syntax
of the language to well-typed formulae. A compile-time test is then formulated
which gives a sufficient condition that well-typedness is an invariant of goals in all
computations. This approach to using types, called prescriptive typing, has been
followed in many papers and in several logic programming languages, most notably
Gödel (Hill & Lloyd, 1994) and Mercury (Somogyi et al., 1996). Semantically, prescriptive
typing corresponds to taking many-sorted typed logic as a foundation of
logic programming, instead of untyped logic. Our approach is based on untyped
logic and our parametric types approximate actual or intended semantics of the
program. Thus, our work is in the framework of descriptive types, and the vast
literature on prescriptive types is not further discussed here. Let us only mention
some recent research on this topic (Fages & Coquery, 2001; Smaus et al., 2000;
Deransart & Smaus, 2001).

[20] These conditions are stated as a magic transformation of the original program.

In the context of descriptive typing some preliminary ideas on the issue of parametric
polymorphism are discussed already in (Mishra, 1984) as a possible extension
of the presented type checking method for non-directional types. (Zobel, 1987)
presents a method for deriving "syntactic" polymorphic types. These types are not
directional. They are clearly related to term grammars but the paper does not explain
the relationship. Our techniques focus on directional types and are based on
semantic considerations.

Polymorphic directional types for logic programs discussed in (Boye, 1996) are 
based on the annotation method of (Deransart, 1993) for proving correctness of 
logic programs. This method is different from that used in our work and refers to 
a different semantics. In spite of that, the verification conditions have a similar
nature to ours and give rise to similar parametric set constraints. Our work goes
further in that we use such parametric constraints in a sufficient correctness test,
and also for type inference, while the simplification techniques of (Boye, 1996) are
rather limited in handling parameters.



The problem of polymorphic directional type checking is also addressed in
(Rychlikowski & Truderung, 2000) and more recently in (Rychlikowski & Truderung,
2001). This work presents a formal system, where directional well-typing of a logic
program for a given type specification is defined in terms of proofs constructed from
given axioms and typing rules. This is different from our approach, where the
well-typing algorithms are derived from the semantic concept of program correctness
and types are understood as families of sets, specified by means of PED-grammars.
Thus it seems impossible to compare our type checking algorithms with those
discussed in (Rychlikowski & Truderung, 2000).



Nevertheless, the semantics of types as sets is also provided in (Rychlikowski &
Truderung, 2000). It is done by a fixpoint construction, which for a given alphabet
of typed function symbols associates each used type with a subset of the Herbrand
universe. In this, rather indirect, way a similar effect is obtained as by our direct 
specification of types by means of PED-grammars. However, the class of the sets 
which can be constructed in that way is not precisely characterized. Syntactic re- 
strictions on the way of defining signatures seem to make it somewhat restricted. 
For example, it is impossible to have nonempty intersection of instances of different 
polymorphic types, e.g. [] cannot be used for representing both the empty list and 
the empty tree. This is a substantial restriction, e.g. one cannot define a type of
even-length lists.
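
To make the last point concrete, the following Python sketch encodes a discriminative term grammar for even-length lists and tests membership of ground terms. It is an illustration only and not part of the prototype tool described in this paper: the dictionary encoding of the grammar and the names GRAMMAR, member, cons and from_list are assumptions introduced here. The sketch shows that [] belongs both to the type of all lists and to the type of even-length lists, which is the kind of overlap that the syntactic restrictions discussed above appear to exclude.

```python
# Ground terms are tuples (functor, arg1, ..., argn); the constant [] is ("[]",).
# List elements may be arbitrary Python values, since they are only ever
# checked against the type "any".
NIL = ("[]",)


def cons(head, tail):
    """Build the list cell [head | tail]."""
    return (".", head, tail)


def from_list(items):
    """Convert a Python list into a ground list term."""
    term = NIL
    for item in reversed(items):
        term = cons(item, term)
    return term


# Hypothetical encoding of a discriminative term grammar:
# each type name maps every allowed function symbol to exactly one
# sequence of argument types. "any" denotes the set of all terms.
GRAMMAR = {
    "list":  {"[]": [], ".": ["any", "list"]},    # all lists
    "elist": {"[]": [], ".": ["any", "olist"]},   # even-length lists
    "olist": {".": ["any", "elist"]},             # odd-length lists
}


def member(term, type_name, grammar=GRAMMAR):
    """Return True if the ground term belongs to the set denoted by type_name."""
    if type_name == "any":
        return True
    functor, *args = term
    arg_types = grammar[type_name].get(functor)
    if arg_types is None or len(arg_types) != len(args):
        return False
    return all(member(a, t, grammar) for a, t in zip(args, arg_types))
```

For instance, member(from_list([1, 2]), "elist") holds while member(from_list([1]), "elist") does not, and member(NIL, t) holds for both t = "list" and t = "elist".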

The soundness theorem of (Rychlikowski & Truderung, 2000) relates the direc-
tional types of well-typed programs to their declarative semantics, while the types
discussed here are related to the call-success semantics. Failure of our type checking
algorithm locates potential errors in a fragment of a clause, while a proof failure of
(Rychlikowski & Truderung, 2000) seems to indicate a whole clause. (At least this
issue is not discussed in that paper.) Handling of constraints is not discussed in their
work; its main objective is representing different directional types of a predicate by
one main type.

7.2 Conclusions

We extended the concept of partial correctness for a logic program w.r.t. a direc-
tional type to a concept of partial correctness of a CLP program w.r.t. a parametric
directional specification. We formulated sufficient conditions for the correctness and
we encoded them as set constraints. In this way we gave a semantic-based view of 
parametric polymorphism in constraint logic programs. 

We extended the notion of discriminative term grammar to the notion of para- 
metric extended discriminative term grammar (PED grammar). We argued that 
directional types specified by such grammars are quite useful. On one hand, they 
make it possible to describe simple approximations of program semantics, easy to 
provide and to understand by the user. On the other hand, they allow automatic
checking of the sufficient conditions mentioned above. Using these conditions one can
also automatically infer parametric directional types from a parametric entry dec- 
laration. 

Our type inference techniques extend to CLP and to the parametric types the
techniques of (Gallagher & de Waal, 1994) corrected by Mildner (1999); they are
based on abstract interpretation of logic programs. It seems possible to extend
instead some of the set constraint solving techniques. This may be a topic of future 
work including also a comparison of both extensions. 

We developed a prototype tool implementing the proposed algorithms, which
can be obtained from the third author. The theoretical result of (Charatonik &
Podelski, 1998) shows that the problem of checking discriminative directional types
is not tractable, even in the parameterless case. The complexity of our type checking
algorithm is exponential w.r.t. the maximal number of occurrences of a variable in a
clause. However, our tool turns out to be sufficiently efficient for practical purposes.

Our tool supports a compile-time technique for error location based on checking 
directional parametric types. Clearly, the class of errors that can be located is 
restricted to type errors. The check locates those clause prefixes that cause the
type errors. Our approach does not impose any type discipline on the program. It 
does not require providing all type declarations in advance and often only a few 
declarations are sufficient to locate an error. The process of specifying declarations 
is supported by the possibility of inspecting and adopting the inferred types. 